ABSTRACT

We develop a new approach to random walks on de Bruijn graphs over the alphabet
$A$ through right congruences on $A^k$, defined using the natural right action of $A^+$. A
major role is played by special right congruences, which correspond to semaphore
codes and allow an easier computation of the hitting time. We show how right
congruences can be approximated by special right congruences.


1 Introduction 

In graph theory, a $k$-dimensional de Bruijn graph over the alphabet $A$ is a directed graph representing
overlaps between sequences of symbols [9, 10]. The de Bruijn graph has $|A|^k$ vertices, given by
all words of length $k$ in the alphabet $A$. There is an edge from vertex $a_1 \cdots a_k \in A^k$ to vertex
$a_2 \cdots a_k a \in A^k$ for every $a \in A$. An important question for cryptography and networking is that of
de Bruijn sequences. A de Bruijn sequence is a cyclic word of length $|A|^k$ such that every possible
word of length $k$ over the alphabet $A$ appears exactly once (see [16] for a review on de
Bruijn sequences). Obviously, a de Bruijn sequence corresponds to an Eulerian path in the de Bruijn
graph.

Here we are interested in random walks on the de Bruijn graph $\Gamma$. To an edge $v \xrightarrow{a} w$ in $\Gamma$ we
associate a probability $0 \le \pi(a) \le 1$, satisfying $\sum_{a \in A} \pi(a) = 1$. This gives rise to the de
Bruijn-Bernoulli process (see for example [5, 2]): if we are at vertex $v$ at a given time, then with probability
$\pi(a)$ we go to vertex $w$, where $v \xrightarrow{a} w$ is an edge in $\Gamma$. The transition matrix $T = (T_{v,w})_{v,w \in A^k}$
encodes the transition probabilities, that is, $T_{v,w} = \pi(a)$ if $v \xrightarrow{a} w$. Given a random walk, an
important question is to determine the stationary distribution, which intuitively is the state that is
reached after taking many steps in the random walk. Mathematically, the stationary distribution
is the vector $I$ such that $IT = I$. In other words, $I$ is the left eigenvector of $T$ with eigenvalue
one. In the case of the de Bruijn-Bernoulli random walk, the stationary distribution $I = (I_w)_{w \in A^k}$ is
multiplicative [5]:

$$I_w = \prod_{a \in w} \pi(a).$$

We can reformulate the random walk on the de Bruijn graph in algebraic terms. Namely, let us
define the right action of $A$ on $A^k$ by

$$a_1 \cdots a_k . a = a_2 \cdots a_k a$$

for $a_1 \cdots a_k \in A^k$ and $a \in A$. This induces the action of the semigroup
$F(|A|, k) := A^1 \cup A^2 \cup \cdots \cup A^k = A^{\le k}$ of all words in $A$ of length $1, 2, \ldots, k$ with the
multiplication $\cdot$ being concatenation and taking the last $k$ letters if the length is bigger than $k$.
For example, if $A = \{a, b\}$ and $k = 3$, we have $ab \cdot ba = bba$ in $F(2,3)$. In this formulation, it is
clear that the walk in $j$ steps given by $a_1 \cdots a_j$ acts as a constant map (i.e., is independent of the
initial vertex) if and only if $j = k$. We call such elements resets.
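
The truncated multiplication and the notion of a reset can be tried out directly. The following Python sketch is only an illustration: the function names `mult`, `act` and `is_reset` are hypothetical, and words are modelled as plain strings over the alphabet.

```python
from itertools import product

def mult(u: str, v: str, k: int) -> str:
    """Product in F(|A|, k): concatenate, then keep the last k letters."""
    return (u + v)[-k:]

def act(vertex: str, word: str, k: int) -> str:
    """Right action of a word on a vertex of the de Bruijn graph (a word of length k)."""
    return (vertex + word)[-k:]

def is_reset(word: str, alphabet: str, k: int) -> bool:
    """A word is a reset if it sends every vertex of A^k to the same vertex."""
    vertices = ["".join(p) for p in product(alphabet, repeat=k)]
    return len({act(v, word, k) for v in vertices}) == 1

# The example from the text: ab . ba = bba in F(2, 3).
example = mult("ab", "ba", 3)
```

With $A = \{a, b\}$ and $k = 3$, a word of length 3 such as `"aba"` passes `is_reset`, while the shorter word `"ab"` does not.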

Random walks on de Bruijn graphs are a “classical” subject. However, in applications it is right
congruences¹ [1, 14, 15, 19] on $A^k$ (denoted by $RC(A^k)$) under the faithful action of $F(|A|,k)$ and
the associated random walks on their congruence classes that are important. Intuitively, these are
the finite semigroups for which any product of $k$ elements acts like a constant map on $A^k$, but because
of the right congruence some products of length less than $k$ might be constant. Right congruences
are a standard idea in the theory of finite state machines (finite automata) [18], where
they are used in passing to the unique minimal automaton doing the same computation. For example,
assume one has a stream of data (e.g. chemical data on waste water being emptied into a river).
Assume that there exists a positive integer $k$ such that only the $k$ most recent symbols of data matter.
Then there is a function $f: A^k \to D$, where $D$ is the data set. The function could be of the form
$f(a_1, \ldots, a_k)$ is ok or not ok (that is, $D$ is a two element set) depending on whether this recent
data of length $k$ meets EPA standards. Then the function $f$ gives an equivalence relation $\sim$ on $A^k$ given
by $s \sim t$ if and only if $f(s) = f(t)$. In addition, there is a unique maximal refinement of $\sim$ which
is a right congruence (that is, the best lower approximation by a right congruence) $R$, namely $s R t$
for $s, t \in A^k$ if and only if for all strings $u \in A^*$ we have $s.u \sim t.u$ or equivalently $f(s.u) = f(t.u)$.
Here $.$ is the multiplication in $F(|A|,k)$. Then $(A^k/R, F(|A|,k))$ can compute the function $f$ since
$f$ factors through the $R$ classes (take $u$ to be $1$). See [18] for more details.
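
The maximal refinement $R$ can be computed by the usual partition refinement used in automata minimization. The Python sketch below is an illustration under stated assumptions: the function name `coarsest_right_congruence` and the sample function `f` (which only flags the word $bbb$) are hypothetical and not taken from the references.

```python
from itertools import product

def coarsest_right_congruence(alphabet, k, f):
    """Classes of the coarsest right congruence on A^k refining s ~ t iff f(s) == f(t)."""
    states = ["".join(p) for p in product(alphabet, repeat=k)]
    label = {s: f(s) for s in states}
    while True:
        # Signature of a state: its own block and the blocks reached by each letter.
        sig = {s: (label[s],) + tuple(label[(s + a)[-k:]] for a in alphabet)
               for s in states}
        ids = {}
        refined = {s: ids.setdefault(sig[s], len(ids)) for s in states}
        if len(ids) == len(set(label.values())):
            break  # no block was split, so the partition is stable
        label = refined
    classes = {}
    for s in states:
        classes.setdefault(label[s], []).append(s)
    return sorted(classes.values())

def f(s):
    # Hypothetical two-valued data function: only the word bbb is "not ok".
    return s == "bbb"

classes = coarsest_right_congruence("ab", 3, f)
```

For this choice of `f` the procedure splits the seven "ok" words into three classes, because words ending in $bb$ or $b$ reach $bbb$ after fewer further letters than the others.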

Consider the right congruence in $RC(A^3)$ with $A = \{a, b\}$ defined by the congruence classes

{aaa, baa, aba}, {bba}, {aab, bab}, {abb}, {bbb}. (1.1)

It is not hard to check that if $w, v \in A^3$ are in the same congruence class, then $w \cdot z$ and $v \cdot z$
for $z \in F(2,3)$ are also in the same congruence class, proving that (1.1) is indeed in $RC(A^3)$. The
transition graph is given in Figure 1.1 and the transition matrix of the associated random walk,
with rows and columns indexed by the classes in the order listed in (1.1), is

$$T = \begin{pmatrix}
\pi(a) & 0 & \pi(b) & 0 & 0 \\
\pi(a) & 0 & \pi(b) & 0 & 0 \\
\pi(a) & 0 & 0 & \pi(b) & 0 \\
0 & \pi(a) & 0 & 0 & \pi(b) \\
0 & \pi(a) & 0 & 0 & \pi(b)
\end{pmatrix}.$$

Figure 1.1: The transition graph for the congruence of Equation (1.1).

¹An equivalence relation is a right congruence if it preserves the right action of a semigroup. See Definition 2.2 for
more details.

By lumping [12, 13], we can obtain the stationary distribution for $T$ from the stationary
distribution of the de Bruijn-Bernoulli random walk by adding the product distributions for each
member of a congruence class. In our example

$$\begin{aligned}
I &= \bigl(\pi(a)^3 + 2\pi(a)^2\pi(b),\ \pi(a)\pi(b)^2,\ \pi(a)^2\pi(b) + \pi(a)\pi(b)^2,\ \pi(a)\pi(b)^2,\ \pi(b)^3\bigr) \\
  &= \bigl(\pi(a)^2 + \pi(a)^2\pi(b),\ \pi(a)\pi(b)^2,\ \pi(a)\pi(b),\ \pi(a)\pi(b)^2,\ \pi(b)^3\bigr),
\end{aligned}$$

where for the second line we used that $\pi(a) + \pi(b) = 1$.
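
The lumping computation can be checked with exact arithmetic. The following Python sketch is illustrative only: the probabilities $\pi(a) = 1/3$, $\pi(b) = 2/3$ are an arbitrary assumption, and the names `weight`, `lumped` and `T` are hypothetical. It sums the product distribution over each class of (1.1), builds the lumped transition matrix from the right action, and verifies stationarity.

```python
from fractions import Fraction
from math import prod

pi = {"a": Fraction(1, 3), "b": Fraction(2, 3)}  # assumed sample probabilities
k = 3
classes = [["aaa", "baa", "aba"], ["bba"], ["aab", "bab"], ["abb"], ["bbb"]]  # (1.1)

def weight(word):
    """Product distribution of the de Bruijn-Bernoulli walk."""
    return prod(pi[c] for c in word)

# Lumped distribution: add the weights of all members of a class.
lumped = [sum(weight(w) for w in cls) for cls in classes]

# Lumped transition matrix: a representative of each class acts by each letter.
index = {w: i for i, cls in enumerate(classes) for w in cls}
n = len(classes)
T = [[Fraction(0)] * n for _ in range(n)]
for i, cls in enumerate(classes):
    for letter, p in pi.items():
        T[i][index[(cls[0] + letter)[-k:]]] += p

# Left multiplication of the row vector by T.
lumped_T = [sum(lumped[i] * T[i][j] for i in range(n)) for j in range(n)]
```

Because (1.1) is a right congruence, the choice of representative `cls[0]` does not matter, and `lumped_T` agrees with `lumped`.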

Recall that all elements in $F(|A|,k)$ of length $k$ are constant maps. We are interested in the
probability that an element of length $1 \le \ell \le k$ is a constant map when $F(|A|,k)$ acts on the
classes of a right congruence. This is intuitively related to the hitting time (or waiting time) to a
constant map. As we will show in Section 6, there is a lattice structure imposed on the set of right
congruences with partial order being inclusion. It turns out that we can approximate right congruences
by special right congruences as introduced in Section 7 using certain meets and joins in this lattice. Special
right congruences in turn are associated to semaphore codes as defined in Section 4, on which it is
easy to compute the hitting time (see Section 8). The hitting time of the approximation (given by a
semaphore code) and the right congruence turn out to be the same, and the approximation is finer
than the right congruence. The stationary distributions of the two are simply related by “lumping”.

Let us now turn our attention to semaphore codes. For a fixed alphabet $A$, which we assume
to be a finite non-empty set, denote by $A^+$ the set of all strings $a_1 \cdots a_\ell$ of length $\ell \ge 1$ over $A$
with multiplication given by concatenation. Thus $(A^+, A)$ is the free semigroup with generators $A$
(since every semigroup $(S, \cdot)$ generated by a subset $A \subseteq S$ is a surmorphic image of $(A^+, A)$ by mapping
$a_1 \cdots a_\ell \mapsto a_1 \cdot a_2 \cdots a_\ell \in S$). Furthermore, let $A^* = A^+ \cup \{1\}$, so that $A^*$ is $A^+$ with the identity
added; it is the free monoid generated by $A$. The semigroup $A^+$ has three orders: “is a suffix”, “is
a prefix”, and “is a factor”. In particular, for $u, v \in A^+$


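$$\begin{aligned}
u \text{ is a suffix of } v \quad &\Longleftrightarrow \quad \exists\, w \in A^* \text{ such that } wu = v, \\
u \text{ is a prefix of } v \quad &\Longleftrightarrow \quad \exists\, w \in A^* \text{ such that } uw = v, \\
u \text{ is a factor of } v \quad &\Longleftrightarrow \quad \exists\, w_1, w_2 \in A^* \text{ such that } w_1 u w_2 = v.
\end{aligned}$$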


A suffix code $C$ of $A^+$ (or over $A$) is a subset $C \subseteq A^+$ so that all elements in $C$ are pairwise
incomparable in the suffix order [6].

A semaphore code [6] is a suffix code $S$ over $A$ for which there is a right action in the following
sense:

If $u \in S \subseteq A^+$ and $a \in A$, then $ua$ has a suffix in $S$ (and hence a unique suffix in $S$).
The right action $u.a$ is the suffix of $ua$ that is in $S$. (1.2)

(The dual concept of prefix codes and left actions is often used in the literature, see for example [6].)
For example, $S = \{ba^j \mid j \ge 0\} =: ba^*$ is an infinite semaphore code with right action

$$ba^j.a = ba^{j+1} \quad \text{and} \quad ba^j.b = b.$$

In practice, to check whether a suffix code is a semaphore code one merely needs to check the first
line of (1.2). For example, $C = \{a, bb\}$ is a suffix code, but $ab$ has no suffix in $C$, so that $C$ is not a
semaphore code.
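
For finite codes, condition (1.2) can be tested mechanically. The following Python sketch is an illustration only; the function names `is_suffix_code`, `act` and `is_semaphore_code` are hypothetical, and codes are modelled as sets of strings.

```python
def is_suffix_code(code):
    """No codeword is a proper suffix of another codeword."""
    return not any(u != v and v.endswith(u) for u in code for v in code)

def act(code, u, a):
    """Right action u.a: the suffix of ua lying in the code, or None if there is none."""
    w = u + a
    matches = [w[i:] for i in range(len(w)) if w[i:] in code]
    return matches[0] if matches else None

def is_semaphore_code(code, alphabet):
    """Check the first line of (1.2) for every codeword and every letter."""
    return is_suffix_code(code) and all(
        act(code, u, a) is not None for u in code for a in alphabet
    )

not_semaphore = is_semaphore_code({"a", "bb"}, "ab")
semaphore = is_semaphore_code({"aa", "ab", "aba", "bba", "abb", "bbb"}, "ab")
```

In a suffix code at most one suffix of $ua$ can be a codeword, so `act` returns the unique match whenever one exists.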

Semaphore codes over $A$ are inherently related to ideals of $A^+$. A subset $I \subseteq A^+$ is an ideal if
$uIv \subseteq I$ for all $u, v \in A^*$. Similarly, $L \subseteq A^+$ is a left ideal if $uL \subseteq L$ for all $u \in A^*$. In this setting,
suffix codes over $A$ are precisely the sets of suffix minimal elements of left ideals $L$.

Now given an ideal $I \subseteq A^+$ we construct a semaphore code as follows. Given $u = a_j \cdots a_2 a_1 \in A^+$,
check whether $u$ is in $I$. If $u \notin I$, ignore $u$. If $u \in I$, we find the (necessarily unique) index $1 \le i \le j$
such that $a_{i-1} \cdots a_1 \notin I$, but $a_i \cdots a_1 \in I$. Then $a_i \cdots a_1$ is a code word and the set of all such words
forms the semaphore code $S =: I\beta_\ell$, as can be readily verified. It is easy to show that

$$I \mapsto I\beta_\ell$$

is a bijection between ideals $I \subseteq A^+$ and semaphore codes over $A$, see Proposition 4.3. Hence
semaphore codes are precisely the sets of suffix minimal elements of ideals $I \subseteq A^+$. Since ideals are
ubiquitous in mathematics, so are semaphore codes!

As mentioned earlier, the set of right congruences $RC(A^k)$ is a finite lattice under the inclusion
order on the congruence classes, where the meet is given by intersection. We prove that $RC(A^k)$ is
semimodular, but not modular in general, and thus satisfies the Jordan-Dedekind condition that all
maximal chains are of the same length. Also for $|A| \ge 2$ and $k \ge 2$, $RC(A^k)$ is not generated by its
atoms. See Section 6 for more details.

Denote by $\mathrm{Sem}(A^k)$ the set of semaphore codes coming from ideals $I \supseteq A^k$. This means that all
codewords of a code in $\mathrm{Sem}(A^k)$ have length less than or equal to $k$ (so the code is finite) and every
member of $A^k$ has a suffix in the code. Starting with a semaphore code $S$ and restricting the codewords
of $S$ to those of length $\le k$ might not yield a finite semaphore code. But it is always possible to
add codewords of length $k$ to this length-restricted code to obtain $S_k \in \mathrm{Sem}(A^k)$. This
process of adding the codewords of length $k$ which have no suffix in the restricted words is unique. For
example, we have seen that $S = ba^*$ is a semaphore code. If we take $k = 3$, we obtain $\{b, ba, ba^2\}$.
However, $aaa$ has no suffix in this set, so it needs to be added to obtain the restricted semaphore
code $S_3 = \{b, ba, baa, aaa\}$. In [22] we show that if $S$ is a semaphore code, then the finite semaphore
code $S_k$ converges to $S$ in some precise sense.
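
The passage from $S$ to $S_k$ is easy to mimic for a code given by a finite list of its short codewords. The Python sketch below is an illustration under stated assumptions: the function name `restrict` is hypothetical, and the infinite code $ba^*$ is represented only by its codewords up to an arbitrary cutoff.

```python
from itertools import product

def restrict(codewords, alphabet, k):
    """Keep codewords of length <= k, then add the words of length k without a suffix among them."""
    kept = {s for s in codewords if len(s) <= k}
    missing = {
        w
        for w in ("".join(p) for p in product(alphabet, repeat=k))
        if not any(w.endswith(s) for s in kept)
    }
    return kept | missing

# Finite portion of S = ba^* (cutoff 10 is an arbitrary choice), restricted to k = 3.
S3 = restrict({"b" + "a" * j for j in range(10)}, "ab", 3)
```

For $k = 3$ the only word of length 3 without a suffix among $b$, $ba$, $baa$ is $aaa$, which is exactly the codeword added in the text.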

Now each semaphore code $S \in \mathrm{Sem}(A^k)$ gives a right congruence $\rho \in RC(A^k)$ as follows:

For two strings $u, v \in A^k$, we say $u \mathrel{\rho} v$ if $u$ and $v$ have a common suffix in $S$. (1.3)

It is not too hard to verify that $\rho$ defines a right congruence on $A^k$. For example, for $A = \{a, b\}$

$$S = \{aa, ab, aba, bba, abb, bbb\} \in \mathrm{Sem}(A^3)$$

yields the right congruence in $RC(A^3)$

{aaa, baa}, {aab, bab}, {aba}, {bba}, {abb}, {bbb}. (1.4)

We denote all elements of $RC(A^k)$ that arise from semaphore codes in $\mathrm{Sem}(A^k)$ by $SRC(A^k)$, the
special right congruences of $RC(A^k)$. We prove in Section 7 that $SRC(A^k)$ is a full (meaning that
top and bottom agree) sublattice of $RC(A^k)$, so that each element $\rho \in RC(A^k)$ has a unique largest
lower (finer) approximation denoted by $\underline{\rho}$, namely $\underline{\rho}$ is the join of all elements in $SRC(A^k)$ contained
in $\rho$. We will also prove in Section 7, and the reader can verify this, that the right congruence
in (1.1) is not a special right congruence, but the special right congruence in (1.4) is its unique largest
lower approximation.
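
Rule (1.3) amounts to grouping the words of $A^k$ by the codeword they end in. The Python sketch below illustrates this for the example code; the function names `special_right_congruence` and `is_finer` are hypothetical.

```python
from itertools import product

def special_right_congruence(code, alphabet, k):
    """Group the words of A^k by the codeword of `code` that is their suffix, as in (1.3)."""
    classes = {}
    for w in ("".join(p) for p in product(alphabet, repeat=k)):
        suffix = next(s for s in code if w.endswith(s))
        classes.setdefault(suffix, []).append(w)
    return classes

def is_finer(fine, coarse):
    """Every class of `fine` is contained in some class of `coarse`."""
    return all(any(set(c) <= set(d) for d in coarse) for c in fine)

S = {"aa", "ab", "aba", "bba", "abb", "bbb"}
by_suffix = special_right_congruence(S, "ab", 3)
rho = [["aaa", "baa", "aba"], ["bba"], ["aab", "bab"], ["abb"], ["bbb"]]  # classes of (1.1)
```

The classes in `by_suffix` are those of (1.4), and `is_finer(by_suffix.values(), rho)` confirms that (1.4) refines (1.1).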

As for the de Bruijn graphs, we have random walks on semaphore codes since there is a right
action of a semigroup on semaphore codes. If $S$ is a semaphore code over the alphabet $A$ and
$\pi: A \to [0,1]$ is any probability distribution on $A$, namely $\sum_{a \in A} \pi(a) = 1$, then [6, Proposition 3.5.1]

$$\sum_{s \in S} \pi(s) = 1,$$

where $\pi(s) = \pi(a_1) \cdots \pi(a_\ell)$ if $s = a_1 \cdots a_\ell$. This means in particular that $S$ is a maximal code with
respect to inclusion.

We can now construct a random walk with state space given by the code words in $S$ using the right
action given in (1.2). Defining the $|S| \times |S|$ monomial matrix $T(a)$ for each $a \in A$ by $T(a)_{s,s.a} = 1$
and $0$ otherwise for all $s \in S$, we obtain the transition matrix as

$$T = \sum_{a \in A} \pi(a)\, T(a).$$

We prove in Theorem 8.1 that the stationary distribution $I$ of $T$ is given by $I = (\pi(s))_{s \in S}$.
Furthermore, the probability that a word of length $\ell$ is a reset (or constant map) is

$$P(\ell) = \sum_{s \in S,\ \ell(s) \le \ell} \pi(s),$$

where $\ell(s)$ is the length of $s$; see Theorem 8.2. This probability is related to the hitting time to reset.
For example, for the semaphore code $S = ba^*$, all words $w$ are resets unless $w = a^\ell$. The probability
that a string of length 3 is a reset is $P(3) = \pi(b) + \pi(b)\pi(a) + \pi(b)\pi(a)^2 = 1 - \pi(a)^3$. For more
details see Section 8.
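
The reset probability for $S = ba^*$ can be compared against a brute-force count. The Python sketch below is an illustration under assumptions: the probabilities $\pi(a) = 1/4$, $\pi(b) = 3/4$ are arbitrary, all function names are hypothetical, and the infinite code is replaced by its restriction to length 5, which behaves like $ba^*$ for words shorter than 5.

```python
from fractions import Fraction
from itertools import product
from math import prod

pi = {"a": Fraction(1, 4), "b": Fraction(3, 4)}  # assumed sample probabilities
code = {"b", "ba", "baa", "baaa", "baaaa", "aaaaa"}  # ba^* restricted to length 5

def weight(word):
    return prod(pi[c] for c in word)

def act(s, word):
    """Right action of a word on a codeword, one letter at a time, as in (1.2)."""
    for a in word:
        x = s + a
        s = next(x[i:] for i in range(len(x)) if x[i:] in code)
    return s

def reset_probability_formula(length):
    """Sum of pi(s) over codewords s of length at most `length`."""
    return sum(weight(s) for s in code if len(s) <= length)

def reset_probability_bruteforce(length):
    """Total weight of the words of the given length that act as constant maps on the code."""
    total = Fraction(0)
    for letters in product(pi, repeat=length):
        word = "".join(letters)
        if len({act(s, word) for s in code}) == 1:
            total += weight(word)
    return total
```

For lengths 1 to 4 both functions return $1 - \pi(a)^\ell$, in agreement with the value of $P(3)$ computed in the text.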

We are now able to give a more direct construction of the special right congruence $\underline{\rho}$ for
$\rho \in RC(A^k)$, the best lower approximation of $\rho$ in $SRC(A^k)$. Define

$$\mathrm{Res}(\rho) = \{w \in A^+ \mid w \text{ is a reset on } A^k/\rho\}.$$

Then we prove that $\mathrm{Res}(\rho)$ is an ideal of $A^+$ containing $A^k$ and the special right congruence associated
to the semaphore code given by this ideal is $\underline{\rho}$. An immediate consequence is that $\rho$ and $\underline{\rho}$ have
the same hitting time to reset, but in general different stationary distributions. In general, $\underline{\rho}$ has
more congruence classes than $\rho$, so the stationary distributions cannot be the same. Note that both
distributions are determined by lumping from the product distribution of the de Bruijn random
walk on $A^k$. In applications a metric is placed on all distributions of $RC(A^k)$. Then the probability
distribution $\pi$ on $A$ is chosen such that the distance between $I_\rho$ and $I_{\underline{\rho}}$ is minimal. This is called the
principle of choosing a “correct” or “good” probability distribution $\pi$ on $A$.

The paper is organized as follows. In Section 2 we provide the algebraic background of the 
semigroups related to right congruences. The precise definition of resets is given in Section 3. 
Semaphore codes are introduced in Section 4. In Section 5 right congruences and their properties
are studied, in particular the lattice structure in Section 6. Special right congruences are the subject
of Section 7. Random walks on semaphore codes are studied in Section 8. Note that the semaphore
codes introduced in Section 4 can be infinite. The analysis in terms of random walks in Section 8 is
valid for both finite and infinite semaphore codes. In all other sections we restrict to words and
codes of finite length.

2 Algebraic foundations

2.1 Elliptic maps on rooted trees 

Elliptic maps on finite trees were considered by Rhodes and Silva [17, 20]. A tree is a connected
graph that does not contain a cycle, that is, a closed walk in which all vertices are distinct. A leaf of a
tree is a vertex of degree 1, that is, a vertex that is incident to exactly one edge. A rooted tree is a tree in
which a particular vertex is designated as the root. In this case, if a vertex $u$ is on the path from the root
to another vertex $v$, we say that $u$ is an ancestor of $v$, or equivalently, that $v$ is a descendant of $u$. If
in addition $u$ and $v$ are adjacent, we say that $u$ is the parent of $v$, which is the child of $u$.

Figure 2.1: Rooted tree $T(2,3)$.

Figure 2.2: Elliptic map $\varphi: \mathrm{Vert}(T) \to \mathrm{Vert}(T)$ on $T := T(2,3)$ which maps $r_0 \mapsto r_0$, $v_1 \mapsto v_2$,
$v_2 \mapsto v_1$, $v_{11} \mapsto v_{21}$, $v_{12} \mapsto v_{21}$, $v_{13} \mapsto v_{23}$, $v_{21} \mapsto v_{12}$, $v_{22} \mapsto v_{11}$, $v_{23} \mapsto v_{11}$.


Given a rooted tree $T$, we denote by $\mathrm{Vert}(T)$ the set of vertices of $T$. The distance between two
vertices is the minimum number of edges in a path between them. An elliptic map on $T$ is a mapping
$\mathrm{Vert}(T) \to \mathrm{Vert}(T)$ preserving adjacency and distance to the root. Equivalently, an elliptic map on
$T$ is a contraction (it does not increase distances between vertices) that preserves distance to the root, or a
mapping fixing the root and preserving parenthood. We shall write functions on the right since we
will deal with right actions and compositions. Elliptic maps on a fixed rooted tree form a monoid
under composition.

Let $T := T(n_0, \ldots, n_N)$ be a uniformly branching rooted tree, where all leaves are at distance
$N + 1$ from the root $r_0$ and each vertex at distance (or level) $k$ from the root has $n_k$ children for
$k = 0, \ldots, N$. An example of a uniformly branching rooted tree is given in Figure 2.1. An example
of an elliptic map on this tree is given in Figure 2.2.

There is another way to represent an elliptic map $\varphi$ using component actions. Namely, a given
vertex $v \in \mathrm{Vert}(T)$ at level $k$ is completely specified by the unique path $r_0 \to w_1 \to \cdots \to w_k = v$
from the root. Since elliptic maps preserve parenthood, the image of this path under the elliptic map,
$r_0 \to (w_1)\varphi \to \cdots \to (w_k)\varphi = (v)\varphi$, is again a path, this time from $r_0$ to $(v)\varphi$. Hence $\varphi$ can be defined
recursively: given the map from the path $r_0 \to w_1 \to \cdots \to w_{k-1}$ to $r_0 \to (w_1)\varphi \to \cdots \to (w_{k-1})\varphi$, we
can define a map $s_w$ from the children of $w := w_{k-1}$ to the children of $(w_{k-1})\varphi$. The map $s_w$ is called
the component action at vertex $w$. Graphically, we place $s_w$ on the vertex $w$ for every vertex $w$ that
is not a leaf. See Figure 2.4. The elliptic map of Figure 2.2 is written using component actions in
Figure 2.3.

As mentioned before, the product of elliptic maps is composition, which is another elliptic map.
We can formulate this in terms of the component actions. Let $\varphi$ and $\psi$ be elliptic maps on the same
rooted tree $T$ with component actions $s_v$ and $t_v$, respectively, at each vertex $v \in \mathrm{Vert}(T)$ that is not a leaf.
Then the component action of $\varphi \circ \psi$ at vertex $v$ is $s_v t_{(v)\varphi}$, where $(v)\varphi = (v)s_w$ if $w$ is the parent of $v$.
An example is given in Figure 2.5.
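
Component actions lend themselves to a direct encoding. In the Python sketch below, which is only an illustration, a vertex of $T(2,3)$ is a tuple of child indices (the root is the empty tuple), and the dictionary `phi` encodes the component actions of the elliptic map of Figure 2.2; the names `apply` and `compose` are hypothetical.

```python
# Component actions of the map of Figure 2.2: one dictionary per non-leaf vertex.
phi = {
    (): {1: 2, 2: 1},            # at the root: v_1 -> v_2, v_2 -> v_1
    (1,): {1: 1, 2: 1, 3: 3},    # at v_1: v_11 -> v_21, v_12 -> v_21, v_13 -> v_23
    (2,): {1: 2, 2: 1, 3: 1},    # at v_2: v_21 -> v_12, v_22 -> v_11, v_23 -> v_11
}

def apply(components, vertex):
    """Image of a vertex: each coordinate is moved by the component at its parent."""
    return tuple(components[vertex[:i]][x] for i, x in enumerate(vertex))

def compose(first, second):
    """Components of 'first, then second': s_v followed by t at the image of v."""
    return {
        v: {x: second[apply(first, v)][y] for x, y in s.items()}
        for v, s in first.items()
    }

vertices = [()] + [(i,) for i in (1, 2)] + [(i, j) for i in (1, 2) for j in (1, 2, 3)]
phi_squared = compose(phi, phi)
```

Applying `phi_squared` to any vertex gives the same result as applying `phi` twice, which is the composition rule for component actions in this small case.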

Note that a child $v$ of a vertex $w$ can be uniquely specified by the edge $e$ that leads to it.
Hence the path $r_0 = w_0 \to w_1 \to \cdots \to w_k = v$ from $r_0$ to $v$ can alternatively be encoded by a
sequence $e_0 \to e_1 \to \cdots \to e_{k-1}$ of edges, where $e_i$ is the edge from vertex $w_i$ to $w_{i+1}$. For us,
it will be convenient to keep track of the edges by labelling the edges leaving a given vertex
at level $0 \le i \le N$ bijectively with elements from a set $X_i$ with $|X_i| = n_i$. The result is a
labelled rooted tree. See Figure 2.6 for an example. Note that there are many ways to label a
rooted tree. Labelling the rooted tree is equivalent to specifying a coordinate system. Once the
labelling $L$ of $T$ is fixed, a sequence $e_0 \to e_1 \to \cdots \to e_{k-1}$ of edges is determined by an element
$(x_0, x_1, \ldots, x_{k-1}) \in X_0 \times X_1 \times \cdots \times X_{k-1}$.

Figure 2.3: Elliptic map of Figure 2.2 written with component actions: $s_{r_0}$ is the map $v_1 \mapsto v_2$,
$v_2 \mapsto v_1$, $s_{v_1}$ is the map $v_{11} \mapsto v_{21}$, $v_{12} \mapsto v_{21}$, $v_{13} \mapsto v_{23}$, and $s_{v_2}$ is the map $v_{21} \mapsto v_{12}$,
$v_{22} \mapsto v_{11}$, $v_{23} \mapsto v_{11}$.

Figure 2.4: Component action at vertex $w$ of an elliptic map on $T(2,3,3)$. The component action
$s_w$ is a map on the children of $w$, namely on $\{u_1, u_2, u_3\}$, and maps into the children of the image of
$w$ under the elliptic map.

Given a rooted tree $T(n_0, \ldots, n_N)$ with labels in $X = X_0 \times \cdots \times X_N$, elliptic maps can now
be expressed using the labels, giving rise to the wreath product. The component action at level $k$
is described by a semigroup $S_k$ acting faithfully on the right on $X_k$, denoted $(X_k, S_k)$. Then the
wreath product $(X_0, S_0) \circ \cdots \circ (X_N, S_N)$ is $(X, S)$, where $S$ is the semigroup with component action
at level $k$ in $(X_k, S_k)$. More precisely, $\Pi = (\Pi_0, \ldots, \Pi_N) \in S$ if $\Pi_0 \in S_0$, $\Pi_1: X_0 \to S_1$, and generally
$\Pi_k: X_0 \times \cdots \times X_{k-1} \to S_k$ for $1 \le k \le N$, so that for $(x_0, \ldots, x_N) \in X$

$$(x_0, \ldots, x_N)\Pi = \bigl(x_0.\Pi_0,\ x_1.(x_0)\Pi_1,\ x_2.(x_0,x_1)\Pi_2,\ \ldots,\ x_N.(x_0, \ldots, x_{N-1})\Pi_N\bigr). \qquad (2.1)$$

The semigroup element $(x_0, \ldots, x_{k-1})\Pi_k \in S_k$ is the component action in the vertex (or
component) specified by $(x_0, \ldots, x_{k-1})$.

Remark 2.1 The above arguments show that elliptic maps on uniformly branching trees and wreath 
products are the same thing (confirming [20, Proposition 3.3]). 

Multiplication of wreath products is given by composition of the component actions (2.1).
Graphically, on the level of labelled trees directly, the product $\Pi^\varphi \cdot \Pi^\psi$ for $\Pi^\varphi, \Pi^\psi \in (X, S)$
translates to the following:

1. To determine the value of $\Pi^\varphi \cdot \Pi^\psi$ at vertex $x = (x_0, \ldots, x_{k-1})$ in the labelled rooted tree, go
to the corresponding vertex in the tree for $\Pi^\varphi$, keep track of all values at the vertices on the
way and act with the corresponding elements on the vertex vector:

$$x^\varphi = \bigl(x_0.\Pi_0^\varphi,\ x_1.(x_0)\Pi_1^\varphi,\ x_2.(x_0,x_1)\Pi_2^\varphi,\ \ldots,\ x_{k-1}.(x_0, \ldots, x_{k-2})\Pi_{k-1}^\varphi\bigr).$$

2. Then the entry in vertex $(x_0, \ldots, x_{k-1})$ of $\Pi^\varphi \cdot \Pi^\psi$ is $(x_0, \ldots, x_{k-1})\Pi_k^\varphi\, (x_0^\varphi, \ldots, x_{k-1}^\varphi)\Pi_k^\psi$.

Figure 2.5: Composition or product of two elliptic maps on the rooted tree in Figure 2.1.

Figure 2.6: Labelled rooted tree $T(2,3,3)$ with labelling sets $X_0 = \{1,2\}$, $X_1 = X_2 = \{1,2,3\}$.

One of the main questions is “how restrained can the component action be”? See the first half
of [18] and the introduction to [21].

The Prime Decomposition Theorem of Krohn and Rhodes [11] (see also [18] and [21, Chapter
4]) states that every finite semigroup divides an iterated wreath product of its finite simple group
divisors and copies of the three element aperiodic monoid $U_2$ consisting of two right zeroes and an
identity. More precisely, a semigroup $S_1$ divides a semigroup $S_2$, written $S_1 \mid S_2$, if $S_1$ is a homomorphic
image of a subsemigroup of $S_2$. In addition, $U_2 = \{1, a, b\}$ where $xa = a$, $xb = b$, and $1x = x1 = x$
for all $x \in U_2$. A finite semigroup is aperiodic if all of its subgroups are trivial. Alternatively, the
Prime Decomposition Theorem says that the basic building blocks of finite semigroups are the finite
simple groups and semigroups of constant maps with an adjoined identity.

Figure 2.7: Graphical presentation of an elliptic map with RZ component action using the same
labeling as in Figure 2.6. The black leaf has coordinates (1, 2, 2). Since it passes the constant maps
2, 3, 3 on its way, it gets mapped to the leaf with coordinates (2, 3, 3), denoted by the blue leaf.

We say that $I \subseteq S$ is an ideal of the semigroup $S$ if $SI \cup IS \subseteq I$. We write then $I \trianglelefteq S$. The kernel
of a semigroup $S$, denoted $\ker(S)$, is the unique minimal nonempty ideal of $S$. If $S$ is a monoid, its
group of units is the subgroup formed by all the invertible elements. Both kernel and group of units
play a major role in this context.

Let $S_1$ and $S_2$ be semigroups and let $\varphi$ be a homomorphism of $S_1$ into endomorphisms of $S_2$.
Then the semigroup $S_1 \rtimes_\varphi S_2$ is the semidirect product of $S_1$ by $S_2$ with connecting homomorphism
$\varphi$ (see also [21, Section 1.2.2, pg. 23]). More precisely, $S_1 \rtimes_\varphi S_2$ has elements in $S_1 \times S_2$ with
multiplication given by

$$(s_1, s_2) \cdot (s_1', s_2') = \bigl(s_1 s_1',\ s_2((s_1')\varphi)\, s_2'\bigr).$$

Notice that wreath products are a special case of semidirect products. In fact, wreath products are 
“generic” semidirect products. Namely up to pseudovarieties, semidirect products, wreath products, 
and elliptic products yield the same thing. See [21] for all details. 

A semigroup $S$ is called irreducible if for all finite semigroups $S_1$ and $S_2$ and all connecting
homomorphisms $\varphi$, $S \mid S_1 \rtimes_\varphi S_2$ implies $S \mid S_1$ or $S \mid S_2$. Krohn and Rhodes [11] showed that $S$ is
irreducible if and only if either (a) $S$ is a nontrivial simple group; or (b) $S$ is one of the four divisors
of $U_2$.

A pseudovariety is a collection of finite semigroups closed under taking finite direct products and
divisors (that is, subsemigroups and quotients) [21]. The monoid $U_2$ is in the pseudovariety $\mathbf{RZ}^1$,
where $\mathbf{RZ} = [[xy = y]]$ is the pseudovariety of right zeroes, meaning that all elements $x, y$ in $S \in \mathbf{RZ}$
satisfy the identity $xy = y$. In other words, $\mathbf{RZ}$ is the pseudovariety generated by semigroups of
constant maps. We denote by $\mathbf{RZ}^1$ the pseudovariety generated by semigroups of transformations
consisting of constant maps plus the identity mapping. The elements in $\mathbf{RZ}^1$ are also called left
regular bands, indeed $\mathbf{RZ}^1 = [[x^2 = x,\ xyx = yx]]$ (cf. [21, Proposition 7.3.2]). Random walks
on left regular bands are an important new topic [7, 8]. This has recently also been generalized to
random walks on $\mathcal{R}$-trivial monoids [3, 4].

In light of the Prime Decomposition Theorem, there are three main cases for the component
actions in $S_k$ of the elliptic maps on $T(n_0, \ldots, n_N)$. All of the next three statements have the
following form. First note that the set of elliptic maps on a fixed tree with component action in
a fixed pseudovariety is closed under composition. Suppose that the component action $S_k$ is selected
to be in the pseudovariety $\mathbf{V}$. Then the pseudovariety generated by elliptic maps with component
action in $\mathbf{V}$ (in this case divisors of elliptic maps) is determined and is denoted PV(component in V).

It is on the semigroups of PV(component in V) that we analyze random walks:

1. $S_k$ is in the pseudovariety $\mathbf{RZ}$ with PV(component in RZ) which is delay semigroups (see
Section 2.2). In this case the component action consists only of constant maps. If we label the
branches from a vertex at level $k$ by $X_k = \{1, 2, \ldots, n_k\}$, then we can also label the vertices at
level $k$ by elements in $X_k$. The label $a \in X_k$ means the constant map that maps everything to
$a$. An example is given in Figure 2.7.

2. $S_k$ is in the pseudovariety $\mathbf{RZ}^1$ with PV(component in RZ^1) which is aperiodic semigroups
(which means semigroups with trivial subgroups). In this case the component action consists
of constant maps and the identity; the component monoids are aperiodic. If again the branches
at level $k$ are labelled by $X_k = \{1, 2, \ldots, n_k\}$, then we can label the vertices by elements in
$X_k \cup \{I\}$, where as before $a \in X_k$ denotes the constant map to $a$ and $I$ is the identity.

3. $S_k$ is any finite group plus constant maps and PV(component in any finite group plus constant maps)
is all finite semigroups. In this case the vertices at level $k$ are labelled by elements in a finite
group $G$ which acts on the right on $X_k$ and elements in $X_k$ which give the constant maps. This
yields a component semigroup with group of units in $G$ and kernel in $\mathbf{RZ}$.

In this paper we will restrict to elliptic maps or wreath products with component actions in RZ, 
that is constant maps (without identity) to answer the question about resets. Future papers will 
deal with cases 2 and 3. 

2.2 Delay pseudovariety 

Let $\mathbf{D}$ be the pseudovariety of semigroups whose idempotents are right zeroes, also called the delay
pseudovariety. The pseudovariety $\mathbf{D}$ can be characterized (see [21, pg. 248]) by

$$\mathbf{D} = \bigcup_{k \ge 1} \mathbf{D}_k,$$

where

$$\mathbf{D}_k = [[x_0 x_1 \cdots x_k = x_1 \cdots x_k]], \qquad (2.2)$$

meaning that any $k + 1$ elements $x_0, \ldots, x_k$ in a semigroup $S \in \mathbf{D}_k$ satisfy the identity
$x_0 x_1 \cdots x_k = x_1 \cdots x_k$.

The delay pseudovariety is also equal to $\mathbf{RZ}^N$ defined as

$$\mathbf{RZ}^N = \{S \mid S/\ker(S) \text{ is nilpotent and } \ker(S) \in \mathbf{RZ}\},$$

where we recall that $\mathbf{RZ} = [[xy = y]]$. A semigroup $N$ with zero is nilpotent if $N^k = \{0\}$ for some
$k$, or in other words, $x_1 \cdots x_k = 0$ for all $x_1, \ldots, x_k \in N$. Thus, $S \in \mathbf{D}$ if and only if $S$ satisfies the
pseudoidentity $xy^\omega = y^\omega$, where $y^\omega$ is the unique idempotent in $\langle y \rangle \le S$, or more succinctly

$$\mathbf{D} = [[xy^\omega = y^\omega]] = \mathbf{RZ}^N.$$

The pseudovariety $\mathbf{D}$ is also closed under semidirect products. For all details see [21].

A semigroup $S$ is a subdirect product of $S_1$ and $S_2$, denoted $S \ll S_1 \times S_2$, if $S$ is a subsemigroup of
$S_1 \times S_2$ mapping onto both $S_1$ and $S_2$ via the projections [21, pg. 34]. More concretely, $S \ll S_1 \times S_2$
if and only if there exist surmorphisms $\varphi_i: S \to S_i$ for $i = 1, 2$, so that $\varphi_1$ and $\varphi_2$ separate points,
that is, $s, t \in S$ with $s \ne t$ implies that $(s)\varphi_j \ne (t)\varphi_j$ for some $j \in \{1, 2\}$. The right letter mapping
congruence on a semigroup $S \in \mathbf{D}$ is defined by $s \sim t$ if $zs = zt$ for all $z \in \ker(S)$, that is, we identify
two elements of $S$ if they act the same on the right of $\ker(S)$. Therefore $\sim$ is the kernel of the right
Schützenberger representation of $S$ on $\ker(S)$. We denote by $\mathrm{RLM}: S \to S/{\sim}$ the canonical morphism
$s \mapsto s/{\sim}$, and denote its image by $\mathrm{RLM}(S)$. (This definition agrees with the definition given in [21,
Section 4.6.2].)

From this it now follows that if $S \in \mathbf{D} = \mathbf{RZ}^N$, then

$$S \ll S/\ker(S) \times \mathrm{RLM}(S).$$

This can be observed by letting $\varphi_1: S \to S/\ker(S)$ be the Rees quotient map, which maps $s \mapsto s$ if
$s \notin \ker(S)$ and collapses $\ker(S)$ to a single element. Let $\varphi_2: S \to \mathrm{RLM}(S)$ be the map $s \mapsto s/{\sim}$.
Since $\ker(S) \in \mathbf{RZ}$, the map $\varphi_2$ is injective on $\ker(S)$, so that $\varphi_1$ and $\varphi_2$ separate points. In our
applications, we only care about $\mathrm{RLM}(S)$. Note that a semigroup $S \in \mathbf{D}$ is nilpotent if and only if
$\mathrm{RLM}(S)$ is the trivial semigroup $\{0\}$.

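Observe that for $S, T \in \mathbf{D}$ we have $\ker(S), \ker(T) \in \mathbf{RZ}$ and

$$\begin{aligned}
&\text{if } S \twoheadrightarrow T \text{ then } \mathrm{RLM}(S) \twoheadrightarrow \mathrm{RLM}(T), \\
&\text{if } S \twoheadrightarrow T \text{ then } \ker(S) \twoheadrightarrow \ker(T), \qquad\qquad (2.3) \\
&\mathrm{RLM}(\mathrm{RLM}(S)) \cong \mathrm{RLM}(S).
\end{aligned}$$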

The proofs are not difficult and all details can be found in [21, Section 4.6.2]. 

Definition 2.2 An equivalence relation $\tau$ on $\ker(S)$ is called a right congruence if it preserves the
right action of $S$ on $\ker(S)$, that is, if $z \mathrel{\tau} z'$ implies $(zs) \mathrel{\tau} (z's)$ for all $z, z' \in \ker(S)$ and $s \in S$. We
denote by $RC(\ker(S), S)$ (or by $RC(\ker(S))$ if $S$ is implicit) the set of all right congruences on $\ker(S)$.

We consider $RC(\ker(S))$ (partially) ordered by inclusion. Since the intersection of right
congruences on $\ker(S)$ is still a right congruence, $(RC(\ker(S)), \subseteq)$ is a (complete) $\wedge$-semilattice. Thus
$(RC(\ker(S)), \subseteq)$ is indeed a (complete) lattice with the determined join, described by

$$\bigvee \Lambda = \bigcap \{\rho \in RC(\ker(S)) \mid \lambda \subseteq \rho \text{ for every } \lambda \in \Lambda\}$$

for every $\Lambda \subseteq RC(\ker(S))$.

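It is routine to check that each $\tau \in RC(\ker(S), S)$ determines a congruence $\widetilde{\tau}$ on $(\ker(S), \mathrm{RLM}(S))$
defined by

$$(s/{\sim}) \mathrel{\widetilde{\tau}} (t/{\sim}) \quad \text{if } (zs) \mathrel{\tau} (zt) \text{ for every } z \in \ker(S),$$

where $s/{\sim}$ denotes the equivalence class of $s \in S$ under the right letter mapping congruence $\sim$. Since
$S \in \mathbf{D}$, we have $\ker(S) \in \mathbf{RZ}$, and it follows easily that

$$z \mathrel{\tau} z' \text{ if and only if } (z/{\sim}) \mathrel{\widetilde{\tau}} (z'/{\sim}) \text{ holds for all } z, z' \in \ker(S). \qquad (2.4)$$

Thus right congruences on $\ker(S)$ and right letter mapping images of $S$ are the “same thing”.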


Figure 2.8: Generators for $F(3,3)$ on $T(3,3,3)$ with $A_3 = \{a, b, c\}$.

2.3 Right zero component action 

In this section, we specialize the elliptic maps on rooted uniformly branching trees of Section 2.1 to
the constant component action. That is, we restrict ourselves to the case that the component action
$S_\ell \in \mathbf{RZ} = [[xy = y]]$ for all $0 \le \ell \le N$.

Let $F(g, k)$ be the semigroup generated by $A_g := \{a_1, a_2, \ldots, a_g\}$ modulo all relations of the form

$$a_{i_0} a_{i_1} \cdots a_{i_k} = a_{i_1} \cdots a_{i_k}$$

for $i_0, \ldots, i_k \in \{1, \ldots, g\}$. This semigroup admits a convenient normal form: we can identify $F(g, k)$
with $A^{\le k} \setminus \{\varepsilon\}$, the set of all nonempty words on $A$ of length at most $k$ (we denote the empty word
by $\varepsilon$). Note that we may define the length of an element of $F(g, k)$ as the length of the respective normal
form in $A^{\le k} \setminus \{\varepsilon\}$.

Given $u \in A^+$, let $u\xi_k$ denote the suffix of length $k$ of $u$ if $|u| > k$ and $u$ otherwise. We define a
binary operation $\circ$ on $A^{\le k} \setminus \{\varepsilon\}$ by

$$u \circ v = (uv)\xi_k.$$

This binary operation on the normal forms corresponds to the product of $F(g, k)$. For example in
$F(2, 3)$ with $A_2 = \{a, b\}$ we have $aba \cdot a = baa$, $aba \cdot bbb = bbb$, $b \cdot a = ba$ and so on.

It is immediate that $F(g, k)$ satisfies the identity

$$x_0 x_1 \cdots x_k = x_1 \cdots x_k. \qquad (2.5)$$

Indeed, $F(g, k)$ is the free pro-$\mathbf{D}_k$ semigroup over $A$ (see [21, Subsection 3.2.2] for details on free
pro-$\mathbf{V}$ semigroups, for a pseudovariety $\mathbf{V}$). Since $F(g, k)$ is finite, it follows that $F(g, k) \in \mathbf{D}$. Note
that we can identify $\ker(F(g, k))$ with $A^k$, the set of all words on $A$ of length $k$.
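
For small parameters, identity (2.5) and the description of the kernel can be confirmed by exhaustive search over the normal forms. The Python sketch below is illustrative only; the function names `words`, `mult` and `satisfies_identity` are hypothetical, and the search is feasible only for small $g$ and $k$.

```python
from functools import reduce
from itertools import product

def words(alphabet, k):
    """Normal forms of F(g, k): all nonempty words of length at most k."""
    return ["".join(p) for n in range(1, k + 1) for p in product(alphabet, repeat=n)]

def mult(u, v, k):
    """Product on normal forms: concatenate and keep the last k letters."""
    return (u + v)[-k:]

def satisfies_identity(alphabet, k, n):
    """Check x_0 x_1 ... x_n = x_1 ... x_n for all choices of n + 1 elements of F(g, k)."""
    elems = words(alphabet, k)
    for xs in product(elems, repeat=n + 1):
        lhs = reduce(lambda u, v: mult(u, v, k), xs)
        rhs = reduce(lambda u, v: mult(u, v, k), xs[1:])
        if lhs != rhs:
            return False
    return True

# Words of length k in F(2, 3): they multiply as right zeroes.
kernel = [w for w in words("ab", 3) if len(w) == 3]
kernel_is_right_zero = all(mult(u, v, 3) == v for u in kernel for v in kernel)
```

With `n = k` the check succeeds, while with `n = k - 1` it fails, so $k$ is the smallest delay for which the identity holds in this example.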

The semigroup $F(g, k)$ can also be interpreted in terms of elliptic maps on
$T := T(\underbrace{g, \ldots, g}_{k})$ as follows. As in Section 2.1,
we represent elliptic maps directly on the tree by denoting the component action on the vertices.
Define the generators $\varphi_1, \ldots, \varphi_g$ through trees of depth $k$ with $g$ branches at each level, where in
level $1 \le \ell < k$ the vertices are labeled $a_1, \ldots, a_g$ from left to right. The $i$-th generator has label
$a_i$ at level 0. Since the vertices at level $k$ are not labeled, we will omit them for space reasons. An
example of the generators for $F(3,3)$ is given in Figure 2.8.

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A-B = 



Figure 2.9: Multiplication of elements A and B in F{3, 3). Note that the first two levels are constant 
precisely as specihed by A and B. 

A label $a_i$ in a given vertex denotes the constant map to $a_i$. If we label the edges under each
vertex also $a_1, \ldots, a_g$ from left to right, then we can multiply generators on the labeled tree as in
Section 2.1. See Figure 2.9 for the product of $A$ and $B$ of Figure 2.8. Using the notation $v_{j_1 \cdots j_k}$ to
denote the nodes below the root as in Subsection 2.1, we have $v_{j_1 \cdots j_k}\varphi_i = v_{i j_1 \cdots j_{k-1}}$ and so

$v_{j_1 \cdots j_k}\varphi_{i_{\ell-1}} \cdots \varphi_{i_0} = v_{i_0 \cdots i_{\ell-1} j_1 \cdots j_{k-\ell}}$

for every $\ell \le k$. In terms of component actions, this translates into a tree with $a_{i_0}$ on level 0, $a_{i_1}$ on
all $g$ vertices of level 1, and in general $a_{i_j}$ on all vertices of level $j$ for $0 \le j < \ell$. It follows easily
from

$\varphi_{i_k} \cdots \varphi_{i_0} = \varphi_{i_{k-1}} \cdots \varphi_{i_0}$

that $\varphi_1, \ldots, \varphi_g$ generate a semigroup isomorphic to $F(g,k)$.

This gives a simple proof of Stiffler’s Theorem [23] (see also [21, Theorem 4.5.7, pg. 248]). 

Theorem 2.3 (Stiffler) The smallest pseudovariety containing the 2-element right zero semigroup
that is closed under semidirect product (equivalently wreath or elliptic products) is $\mathbf{D}$.

Proof. As discussed in Section 2.2, $\mathbf{D}$ is a pseudovariety that is closed under semidirect product.
By the arguments above, the free objects $F(g,k)$ are elliptic products with component action in $\mathbf{RZ}$,
and since every member of $\mathbf{D}$ is a homomorphic image of an appropriate free one, the theorem is
proved. □

In the sequel, we will be interested in the classification of right congruences on $\ker(F(g,k)) \in \mathbf{RZ}$.


3 k-reset graphs 

k-reset graphs are finite state automata [18] with the additional property that strings of length $k$
are resets or constant maps. The formalism is such that the definitions in the profinite case, when $k$
tends to infinity, are very similar. Let us now discuss the details.

Let $A$ be a finite nonempty alphabet. An A-graph is a structure of the form $\Gamma = (Q,E)$, where:

• $Q$ is a finite nonempty set (vertex set);

• $E \subseteq Q \times A \times Q$ (edge set).

A nontrivial path in an A-graph $\Gamma = (Q,E)$ is a finite sequence of the form

$q_0 \xrightarrow{a_1} q_1 \xrightarrow{a_2} \cdots \xrightarrow{a_n} q_n$

such that $(q_{i-1}, a_i, q_i) \in E$ for $i = 1, \ldots, n$. Its label is the word $a_1 a_2 \cdots a_n \in A^+ = A^* \setminus \{\varepsilon\}$, where
$A^*$ is the set of words in the alphabet $A$ and $\varepsilon$ is the empty word. A trivial path is a formal expression
of the form

$q \xrightarrow{\varepsilon} q$

with $q \in Q$.

An A-graph $\Gamma = (Q,E)$ is:

• deterministic if

$(p,a,q), (p,a,q') \in E \Rightarrow q = q'$

holds for all $p, q, q' \in Q$ and $a \in A$;

• complete if

$\forall p \in Q\ \forall a \in A\ \exists q \in Q : (p,a,q) \in E$;

• strongly connected if, for all $p, q \in Q$, there exists a path $p \xrightarrow{u} q$ in $\Gamma$ for some $u \in A^*$.

If $\Gamma = (Q,E)$ is deterministic and complete, then $E$ induces a function

$Q \times A \to Q$, $(q,a) \mapsto qa$

defined by $(q, a, qa) \in E$. Conversely, every such function defines a deterministic complete A-graph.
Moreover, we can extend the function $Q \times A \to Q$ to a function $Q \times A^* \to Q$ as follows: given $q \in Q$
and $u \in A^*$, $qu$ is the unique vertex such that there exists a path

$q \xrightarrow{u} qu$

in $\Gamma$. This function is called the transition function of $\Gamma$.
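As an illustration of these definitions, the sketch below stores an A-graph as a set of triples $(p, a, q)$ and implements the two properties and the extended transition function. The representation, the function names and the two-vertex example are assumptions made for this sketch only.

```python
def is_deterministic(edges):
    """No two edges share the same source and letter but differ in target."""
    targets = {}
    for p, a, q in edges:
        if targets.setdefault((p, a), q) != q:
            return False
    return True


def is_complete(vertices, alphabet, edges):
    """Every vertex has at least one outgoing edge for every letter."""
    sources = {(p, a) for p, a, _ in edges}
    return all((p, a) in sources for p in vertices for a in alphabet)


def transition(edges, q, word):
    """Extend the one-letter action to words: return the vertex qu."""
    step = {(p, a): r for p, a, r in edges}
    for a in word:
        q = step[(q, a)]
    return q


# Hypothetical two-vertex example over {a, b}: reading a leads to 0, reading b leads to 1.
Q = {0, 1}
E = {(0, "a", 0), (0, "b", 1), (1, "a", 0), (1, "b", 1)}
print(is_deterministic(E), is_complete(Q, "ab", E), transition(E, 0, "ab"))
```

The empty word leaves every vertex fixed, matching the convention for trivial paths.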

Let $\Gamma = (Q,E)$ and $\Gamma' = (Q',E')$ be A-graphs. A morphism $\varphi : \Gamma \to \Gamma'$ is a function $\varphi : Q \to Q'$
such that

$(p,a,q) \in E \Rightarrow (p\varphi, a, q\varphi) \in E'$.

If $\varphi$ is bijective and $\varphi^{-1}$ is also a morphism, we say that $\varphi$ is an isomorphism. In this case we write
$\Gamma \cong \Gamma'$.

Given A-graphs $\Gamma, \Gamma'$, we write $\Gamma \le \Gamma'$ if there exists a morphism $\Gamma \to \Gamma'$. This is clearly a
reflexive and transitive relation, hence a preorder on the class of all A-graphs. Technically, this is
not a partial order, but we have the following remark:

Lemma 3.1 Let $A$ be a finite nonempty alphabet and let $\Gamma, \Gamma'$ be strongly connected deterministic
complete A-graphs such that $\Gamma \le \Gamma' \le \Gamma$. Then $\Gamma \cong \Gamma'$.

Proof. Let $\varphi : \Gamma \to \Gamma'$ and $\psi : \Gamma' \to \Gamma$ be morphisms. Write $\Gamma = (Q,E)$ and $\Gamma' = (Q',E')$. Fix
some $q_0 \in Q$ and take $q' \in Q'$. Since $\Gamma'$ is strongly connected, there exists some path $q_0\varphi \xrightarrow{u} q'$ in $\Gamma'$
for some $u \in A^*$. Since $\Gamma$ is complete, there exists some path $q_0 \xrightarrow{u} q$ in $\Gamma$ for some $q \in Q$. It follows
from $\varphi$ being a morphism that there exists a path $q_0\varphi \xrightarrow{u} q\varphi$ in $\Gamma'$. Since $\Gamma'$ is deterministic, we get
$q' = q\varphi$, hence $\varphi$ is onto and so $|Q'| \le |Q|$. By symmetry, we get $|Q'| = |Q|$, thus $\varphi$ is bijective.

It remains to be proved that $\varphi^{-1}$ is a morphism. Assume that $(p\varphi, a, q\varphi) \in E'$ for some $p, q \in Q$
and $a \in A$. Since $\Gamma$ is complete, there exists some $(p,a,r) \in E$. Since $\varphi$ is a morphism, we get
$(p\varphi, a, r\varphi) \in E'$. Now $\Gamma'$ being deterministic yields $q\varphi = r\varphi$, and so $q = r$ since $\varphi$ is bijective.
Therefore $(p,a,q) \in E$ and so $\varphi^{-1}$ is a morphism as required. □

We say that $u \in A^*$ is a reset word for the deterministic and complete A-graph $\Gamma = (Q,E)$ if
$|Qu| = 1$. This is equivalent to saying that all paths labeled by $u$ end at the same vertex. Let $\mathrm{Res}(\Gamma)$
denote the set of all reset words for $\Gamma$. For every $k \in \mathbb{N}$, let

$\mathrm{Res}_k(\Gamma) = \mathrm{Res}(\Gamma) \cap A^k$.

We say that $\Gamma$ is a k-reset graph if $\mathrm{Res}_k(\Gamma) = A^k$. We denote by $\mathrm{RG}_k(A)$ the class of all strongly
connected deterministic complete k-reset A-graphs.
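The reset condition can be checked by brute force on small graphs. The sketch below computes the image set $Qu$ and tests whether every word of length $k$ is a reset; as a test case it builds the 2-dimensional de Bruijn graph over $\{a, b\}$. The dictionary encoding of the transition function is our own assumption.

```python
from itertools import product


def image(step, states, word):
    """Return the set Qu for Q = states and u = word."""
    current = set(states)
    for a in word:
        current = {step[(q, a)] for q in current}
    return current


def is_k_reset(step, states, alphabet, k):
    """Check that every word of length k is a reset word."""
    return all(len(image(step, states, w)) == 1 for w in product(alphabet, repeat=k))


# de Bruijn graph of dimension 2 over {a, b}: vertices are words of length 2,
# and reading a letter shifts it in from the right.
alphabet = "ab"
states = {"".join(w) for w in product(alphabet, repeat=2)}
step = {(u, a): (u + a)[-2:] for u in states for a in alphabet}
print(is_k_reset(step, states, alphabet, 2), is_k_reset(step, states, alphabet, 1))
```

In this example words of length 2 are resets while single letters are not, so the graph is 2-reset but not 1-reset.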

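Given $\Gamma \in \mathrm{RG}_k(A)$, let $[\Gamma]$ denote the isomorphism class of $\Gamma$. Let

$\mathrm{RG}_k(A)/{\cong} = \{[\Gamma] \mid \Gamma \in \mathrm{RG}_k(A)\}$.

Given $\Gamma, \Gamma' \in \mathrm{RG}_k(A)$, write

$[\Gamma] \le [\Gamma']$ if $\Gamma \le \Gamma'$.

It is immediate that $\le$ is a well-defined preorder on $\mathrm{RG}_k(A)/{\cong}$. Moreover, it follows from Lemma 3.1
that:

Corollary 3.2 Let $A$ be a finite nonempty alphabet and let $k \ge 1$. Then $\le$ is a partial order on
$\mathrm{RG}_k(A)/{\cong}$.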


4 Semaphore codes 

A detailed discussion on semaphore codes can be found in [6, Chapter 3.4]. 

Let $A$ be a finite alphabet. We define three partial orders on $A^*$ by

• $u \le_p v$ if $v \in uA^*$,

• $u \le_s v$ if $v \in A^*u$,

• $u \le_f v$ if $v \in A^*uA^*$.

We refer to them as the prefix order, the suffix order and the factor order on $A^*$.

If $X \subseteq A^*$ is a nonempty antichain with respect to $\le_p$ (respectively $\le_s$, $\le_f$), it is said to be
a prefix code (respectively suffix code, infix code). Note that our notions differ slightly from the
standard notions since we admit $\{\varepsilon\}$ to be a code of all three types!

Given an ideal $I \trianglelefteq A^*$, let $I\beta$ denote the subset of elements of $I$ which are minimal with respect
to $\le_f$. Then $I = A^*(I\beta)A^*$ and $I\beta \subseteq B$ whenever $B \subseteq A^*$ satisfies $I = A^*BA^*$. We say that $I\beta$ is
the basis of $I$. Clearly, the correspondences

$I \mapsto I\beta$, $C \mapsto A^*CA^*$

establish mutually inverse bijections between the set of all ideals of $A^*$ and the set of all infix codes
on $A$.

We say that $L \subseteq A^*$ is a left ideal if $L \neq \emptyset$ and $A^*L \subseteq L$. We write then $L \trianglelefteq_\ell A^*$. Given $L \trianglelefteq_\ell A^*$,
let $L\beta_\ell$ denote the subset of elements of $L$ which are minimal with respect to $\le_s$. Then $L = A^*(L\beta_\ell)$
and $L\beta_\ell \subseteq B$ whenever $B \subseteq A^*$ satisfies $L = A^*B$. We say that $L\beta_\ell$ is the left basis of $L$. Clearly,
the correspondences

$L \mapsto L\beta_\ell$, $S \mapsto A^*S$

establish mutually inverse bijections between the set of all left ideals of $A^*$ and the set of all suffix
codes on $A$.

Similarly, $R \subseteq A^*$ is a right ideal if $R \neq \emptyset$ and $RA^* \subseteq R$. We write then $R \trianglelefteq_r A^*$.

We relate now ideals to semaphore codes. The definition we use is actually the left-right dual of
the classical definition in [6, Section 3.5], but we shall call them semaphore codes for simplification.
We also admit $\emptyset$ and $\{\varepsilon\}$ as (semaphore) codes, but this generalization is compatible with the relevant
results from [6].

A semaphore code on the alphabet $A$ is a language of the form

$XA^* \setminus A^+XA^*$,

for some $X \subseteq A^*$. If $X \neq \emptyset$, then $XA^* \setminus A^+XA^*$ is a maximal suffix code (with respect to inclusion)
by [6, Proposition 3.5.1]. Now [6, Proposition 3.5.4] provides an alternative characterization of
semaphore codes:

Lemma 4.1 [6, Proposition 3.5.4] For every $S \subseteq A^*$, the following conditions are equivalent:

(i) $S$ is a semaphore code;

(ii) $S$ is a suffix code and $SA \subseteq A^*S$.

Let $\mathrm{Sem}(A)$ denote the set of all semaphore codes on the alphabet $A$. We define a partial order
$\le$ on $\mathrm{Sem}(A)$ by $S \le S'$ if $A^*S \subseteq A^*S'$.

Example 4.2 Let $A = \{a,b\}$ and $X = \{b\}$. Then the semaphore code is infinite:

$S = XA^* \setminus A^+XA^* = \{b, ba, ba^2, ba^3, \ldots\} = ba^*$.

If on the other hand $A = \{a,b\}$ and $X = \{a^2, ab, b^2\}$, then the semaphore code is finite:

$S = XA^* \setminus A^+XA^* = \{a^2, ab, b^2, aba, b^2a\}$.
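Both codes of Example 4.2 can be recomputed by enumeration. The sketch below lists the words of $XA^* \setminus A^+XA^*$ up to a length bound (needed because the code may be infinite) and checks condition (ii) of Lemma 4.1 on the finite example. It assumes that $X$ consists of nonempty words; the function names and the length bound are our own.

```python
from itertools import product


def words(alphabet, max_len):
    """All words over `alphabet` of length at most `max_len`, shortest first."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def semaphore_code(X, alphabet, max_len):
    """Words of X A* minus A+ X A* having length at most `max_len`."""
    code = set()
    for w in words(alphabet, max_len):
        starts_in_X = any(w.startswith(x) for x in X)
        # An occurrence of some x at a position >= 1 puts w in A+ X A*.
        later_occurrence = any(w.startswith(x, i) for x in X for i in range(1, len(w)))
        if starts_in_X and not later_occurrence:
            code.add(w)
    return code


def is_suffix_code(S):
    """No word of S is a proper suffix of another word of S."""
    return not any(s != t and t.endswith(s) for s in S for t in S)


S = semaphore_code({"aa", "ab", "bb"}, "ab", 6)
# Condition SA contained in A*S: every word sa has a suffix in S.
closed = all(any((s + a).endswith(t) for t in S) for s in S for a in "ab")
print(sorted(S), is_suffix_code(S), closed)
```

Raising the length bound does not add words in the second example, in line with the claim that this code is finite.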


We denote by $\mathcal{I}(A)$ (respectively $\mathcal{L}(A)$, $\mathcal{R}(A)$) the set of all ideals (respectively left ideals, right
ideals) of $A^*$. If we order $\mathcal{I}(A)$ (or $\mathcal{L}(A)$ or $\mathcal{R}(A)$) by inclusion, we get a complete (distributive)
lattice where meet and join are given by intersection and union. The top element is $A^*$ and the
bottom element is $\emptyset$. We can now prove the following.

Proposition 4.3 Let $A$ be a finite nonempty alphabet. Then

$\Phi : (\mathcal{I}(A), \subseteq) \to (\mathrm{Sem}(A), \le)$, $I \mapsto I\beta_\ell$

and

$\Psi : (\mathrm{Sem}(A), \le) \to (\mathcal{I}(A), \subseteq)$, $S \mapsto A^*S$

are mutually inverse lattice isomorphisms.

Proof. Let $I \in \mathcal{I}(A)$. Then $I\beta_\ell$ is clearly a suffix code. Since $(I\beta_\ell)A \subseteq I = A^*(I\beta_\ell)$, then
$I\beta_\ell \in \mathrm{Sem}(A)$ by Lemma 4.1 and $\Phi$ is well-defined.

On the other hand, given $S \in \mathrm{Sem}(A)$, it is clear that $A^*S \trianglelefteq_\ell A^*$. Now $SA \subseteq A^*S$ by Lemma 4.1,
hence $A^*S$ is actually an ideal of $A^*$ and so $\Psi$ is also well-defined.

Now $I\Phi\Psi = A^*(I\beta_\ell) = I$ and $S\Psi\Phi = (A^*S)\beta_\ell = S$ follows easily from $S$ being a suffix code,
hence $\Phi$ and $\Psi$ are mutually inverse bijections. Since $S \le S'$ if and only if $S\Psi \subseteq S'\Psi$ holds for all
$S, S' \in \mathrm{Sem}(A)$, $\Phi$ and $\Psi$ are actually mutually inverse poset isomorphisms. Since $(\mathcal{I}(A), \subseteq)$ is a
lattice, so is $(\mathrm{Sem}(A), \le)$ and so $\Phi$ and $\Psi$ are lattice isomorphisms. □


As we will see in Section 7, semaphore codes are related to special right congruences.

5 Right congruences on the minimal ideal of $F(g,k)$

Now fix a nonempty alphabet $A = \{a_1, \ldots, a_g\}$ and a positive integer $k$. We remarked in
Subsection 2.3 that $A^{\le k} \setminus \{\varepsilon\}$ is a set of normal forms for $F(g,k)$, the free pro-$\mathbf{D}_k$ semigroup on the set
$A = \{a_1, \ldots, a_g\}$. Moreover, we can identify $A^k$ with $\ker(F(g,k))$. Since $F(g,k)$ is generated by $A$,
right congruences on $A^k$ can be described as equivalence relations $\rho$ satisfying

$u \,\rho\, v \Rightarrow (u \circ a) \,\rho\, (v \circ a)$

for every $a \in A$, or equivalently,

$u \,\rho\, v \Rightarrow ((ua)\xi_k) \,\rho\, ((va)\xi_k)$

for every $a \in A$.
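This condition is straightforward to test when an equivalence relation on $A^k$ is given by its classes. The sketch below is one possible check; it assumes the input is a partition of all of $A^k$, and the two partitions of $A^2$ used as examples are our own.

```python
def is_right_congruence(partition, alphabet, k):
    """Check u rho v => (u o a) rho (v o a) for the relation given by `partition`."""
    block_of = {u: i for i, block in enumerate(partition) for u in block}
    for block in partition:
        for a in alphabet:
            # All words of a block must be sent into a single block by the letter a.
            if len({block_of[(u + a)[-k:]] for u in block}) != 1:
                return False
    return True


# Two hypothetical partitions of A^2 for A = {a, b}.
good = [{"aa", "ba"}, {"ab"}, {"bb"}]   # identifies words sharing the suffix a
bad = [{"aa", "ab"}, {"ba"}, {"bb"}]    # aa o a = aa and ab o a = ba are separated
print(is_right_congruence(good, "ab", 2), is_right_congruence(bad, "ab", 2))
```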

Given $R \subseteq A^k \times A^k$, we denote by $R^\sharp$ the right congruence on $A^k$ generated by $R$, i.e. the
intersection of all right congruences on $A^k$ containing $R$. Let $u, v \in A^k$. Then $(u,v) \in R^\sharp$ if and only
if there exists some finite sequence $w_0, \ldots, w_n \in A^k$ ($n \ge 0$) such that:

• $w_0 = u$ and $w_n = v$,

• for every $i = 1, \ldots, n$, there exist $(r_i, s_i) \in R$ and $x_i \in A^*$ such that $\{w_{i-1}, w_i\} = \{r_i \circ x_i, s_i \circ x_i\}$.

It is easy to see that

$\bigvee P = (\bigcup P)^\sharp$

for every $P \subseteq \mathrm{RC}(A^k)$.

We now relate right congruences on $A^k$ with the k-reset graphs introduced in Section 3.

Given $\rho \in \mathrm{RC}(A^k)$, the Cayley graph of $\rho$ is the A-graph $\mathrm{Cay}(\rho) = (A^k/\rho, E)$ defined by

$E = \{(u\rho, a, (u \circ a)\rho) \mid u \in A^k, a \in A\}$,

where $u\rho$ denotes the congruence class of $u$. In particular, if $\rho$ is the identity relation, then $\mathrm{Cay}(\rho)$
is a k-dimensional de Bruijn graph on $|A|$ symbols.
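The Cayley graph can be built directly from the classes of a right congruence. The sketch below represents each vertex as a frozen set of words and, for the identity relation on $A^2$ with $A = \{a, b\}$, recovers the de Bruijn graph with 4 vertices and 8 edges. The data representation is an assumption of this sketch.

```python
from itertools import product


def cayley_graph(partition, alphabet, k):
    """Vertices and edges of Cay(rho), where rho is given by its classes."""
    block_of = {u: frozenset(block) for block in partition for u in block}
    vertices = set(block_of.values())
    edges = {(block_of[u], a, block_of[(u + a)[-k:]]) for u in block_of for a in alphabet}
    return vertices, edges


# Identity relation on A^2 for A = {a, b}: every class is a singleton.
alphabet, k = "ab", 2
identity = [{"".join(w)} for w in product(alphabet, repeat=k)]
vertices, edges = cayley_graph(identity, alphabet, k)
print(len(vertices), len(edges))
```

For a coarser right congruence the same function returns a quotient of the de Bruijn graph, which stays deterministic precisely because the relation is a right congruence.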

Given $\Gamma = (Q,E) \in \mathrm{RG}_k(A)$, let $\zeta_\Gamma$ be the equivalence relation on $A^k$ defined by

$u \,\zeta_\Gamma\, v$ if $Qu = Qv$.

Note that

$Q((ua)\xi_k) = Qua$ (5.1)

holds for all $u \in A^k$ and $a \in A$. Indeed, since $Qua \subseteq Q((ua)\xi_k)$ and $(ua)\xi_k$ is a reset word, we must
have equality and (5.1) holds.

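Proposition 5.1 Let $A$ be a finite nonempty alphabet and $k \ge 1$. Then

$\Phi : (\mathrm{RC}(A^k), \subseteq) \to (\mathrm{RG}_k(A)/{\cong}, \le)$, $\rho \mapsto [\mathrm{Cay}(\rho)]$

and

$\Psi : (\mathrm{RG}_k(A)/{\cong}, \le) \to (\mathrm{RC}(A^k), \subseteq)$, $[\Gamma] \mapsto \zeta_\Gamma$

are mutually inverse lattice isomorphisms.

Proof. Let $\rho \in \mathrm{RC}(A^k)$. It follows from the definition that $\mathrm{Cay}(\rho)$ is deterministic and complete.
For all $u, v \in A^k$, we have $u \circ v = v$, hence there exists a path

$u\rho \xrightarrow{v} (u \circ v)\rho = v\rho$

in $\mathrm{Cay}(\rho)$. It follows that $\mathrm{Cay}(\rho)$ is strongly connected and $A^k \subseteq \mathrm{Res}_k(\mathrm{Cay}(\rho))$, thus
$\mathrm{Cay}(\rho) \in \mathrm{RG}_k(A)$ and $\Phi$ is well-defined.

On the other hand, it is clear that $[\Gamma]\Psi$ does not depend on the chosen representative for the
isomorphism class $[\Gamma]$.

Let $\Gamma \in \mathrm{RG}_k(A)$. Let $(u,v) \in \zeta_\Gamma$ and $a \in A$. Then $Qu = Qv$ implies $Qua = Qva$ and therefore
$(u \circ a, v \circ a) \in \zeta_\Gamma$ in view of (5.1). Thus $\zeta_\Gamma \in \mathrm{RC}(A^k)$ and so $\Psi$ is well-defined.

Let $\rho \in \mathrm{RC}(A^k)$ and write $\rho' = \zeta_{\mathrm{Cay}(\rho)}$. If $Q = A^k/\rho$ is the vertex set of $\mathrm{Cay}(\rho)$, then $Qu = \{u\rho\}$
for every $u \in A^k$. Hence

$u \,\rho'\, v \Leftrightarrow Qu = Qv \Leftrightarrow u\rho = v\rho$

and so $\Phi\Psi = 1$.

Conversely, let $\Gamma = (Q,E) \in \mathrm{RG}_k(A)$ and let $\Gamma' = \mathrm{Cay}(\zeta_\Gamma)$. We show that

$\forall q \in Q\ \exists u_q \in A^k : Qu_q = \{q\}$. (5.2)

We may assume that $|Q| > 1$. Since $\Gamma$ is strongly connected, it follows that there exists a loop $q \xrightarrow{w} q$
in $\Gamma$ with $w \neq \varepsilon$. Replacing $w$ by a proper power if necessary, we may assume that $|w| \ge k$. Hence
there exists some $u_q \in A^k$ such that $q \in Qu_q$. Since $u_q$ is necessarily a reset word, we get $Qu_q = \{q\}$
and so (5.2) holds.

We define a mapping

$\theta : Q \to A^k/\zeta_\Gamma$, $q \mapsto u_q\zeta_\Gamma$.

Note that

$Qu = Qv \Leftrightarrow u\zeta_\Gamma = v\zeta_\Gamma$ (5.3)

holds for all $u, v \in A^k$, hence $\theta$ is well-defined and one-to-one. Since $\Gamma$ is a k-reset graph, $\theta$ is also
onto. We show that $\theta$ is an isomorphism from $\Gamma$ onto $\mathrm{Cay}(\zeta_\Gamma)$.

Assume that $(p,a,q) \in E$. By (5.1), we get

$Q(u_p \circ a) = Qu_pa = pa = q = Qu_q$.

Hence $u_q\zeta_\Gamma = (u_p \circ a)\zeta_\Gamma$ and so there exists an edge $u_p\zeta_\Gamma \xrightarrow{a} u_q\zeta_\Gamma$ in $\mathrm{Cay}(\zeta_\Gamma)$.

Conversely, assume that $u_p\zeta_\Gamma \xrightarrow{a} u_q\zeta_\Gamma$ is an edge of $\mathrm{Cay}(\zeta_\Gamma)$. Then $u_q\zeta_\Gamma = (u_p \circ a)\zeta_\Gamma$ and so

$q = Qu_q = Q(u_p \circ a) = Qu_pa = pa$

by (5.3) and (5.1). Thus $(p,a,q) \in E$ and so $\theta : \Gamma \to \mathrm{Cay}(\zeta_\Gamma)$ is an isomorphism. Therefore $\Psi\Phi = 1$
and so $\Phi$ and $\Psi$ are mutually inverse bijections.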

Let $\rho, \rho' \in \mathrm{RC}(A^k)$ with $\rho \subseteq \rho'$. Then

$\theta : A^k/\rho \to A^k/\rho'$, $u\rho \mapsto u\rho'$

is a well-defined surjective map. If $u\rho \xrightarrow{a} (u \circ a)\rho$ is an edge of $\mathrm{Cay}(\rho)$, then $u\rho' \xrightarrow{a} (u \circ a)\rho'$ is an
edge of $\mathrm{Cay}(\rho')$, hence $\theta$ is a morphism from $\mathrm{Cay}(\rho)$ to $\mathrm{Cay}(\rho')$ and so $\mathrm{Cay}(\rho) \le \mathrm{Cay}(\rho')$. Thus
$[\mathrm{Cay}(\rho)] \le [\mathrm{Cay}(\rho')]$ and so $\Phi$ is order-preserving.

Let $\Gamma, \Gamma' \in \mathrm{RG}_k(A)$ be such that $[\Gamma] \le [\Gamma']$. Then there exists a morphism $\theta : \Gamma \to \Gamma'$. Write
$\Gamma = (Q,E)$ and $\Gamma' = (Q',E')$. Suppose that $(u,v) \in \zeta_\Gamma$. Then $Qu = Qv = \{q\}$ for some $q \in Q$.
Hence $q\theta \in Q'u \cap Q'v$. Since $\Gamma'$ is a k-reset graph, we get $Q'u = \{q\theta\} = Q'v$ and so $(u,v) \in \zeta_{\Gamma'}$.
Therefore $\Psi$ is order-preserving.

Since $\Phi$ and $\Psi$ are mutually inverse order-preserving mappings, they are isomorphisms of posets.
Since $(\mathrm{RC}(A^k), \subseteq)$ is a lattice, then $(\mathrm{RG}_k(A)/{\cong}, \le)$ is also a lattice, and $\Phi$ and $\Psi$ are mutually inverse
lattice isomorphisms. □


6 Lattice-theoretic properties 

We discuss in this section the lattice-theoretic properties of the lattice $\mathrm{RC}(A^k)$.

We recall some well-known notions from lattice theory. Let $L$ be a (finite) lattice with bottom
element $B$ and top element $T$. Given $a, b \in L$, we say that $b$ covers $a$ if $a < b$ and there is no $c \in L$
such that $a < c < b$. If $a$ covers the bottom $B$, we say that $a$ is an atom.

The lattice $L$ is said to be:

• modular if it has no sublattice of the form

[Diagram (6.1): the pentagon lattice, with bottom element $e$]

• semimodular if it has no sublattice of the form (6.1) with $d$ covering $e$;

• atomistic if every element of $L$ is a join of atoms ($B$ being the join of the empty set).

Proposition 6.1 Let $A$ be a nonempty set and $k \ge 1$. Then $\mathrm{RC}(A^k)$ is semimodular.
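Proof. It suffices to show that $\mathrm{RC}(A^k)$ has no sublattice of the form

[Diagram: pentagon lattice with top $\rho$ and bottom $\lambda$, formed by the chains
$\lambda \subset \sigma \subset \sigma' \subset \rho$ and $\lambda \subset \tau \subset \rho$]

with $\tau$ covering $\lambda$ in $\mathrm{RC}(A^k)$.

Suppose it does. Given $x, y \in A^*$, let $\mathrm{lcs}(x,y)$ denote the longest common suffix of $x$ and $y$. If
$x, y \in A^k$ are distinct, then $|\mathrm{lcs}(x,y)| < k$ and so

$|\mathrm{lcs}(x \circ a, y \circ a)| > |\mathrm{lcs}(x,y)|$ (6.2)

for every $a \in A$.

Let $(u,v) \in \tau \setminus \lambda$ with $|\mathrm{lcs}(u,v)|$ maximal. For every $a \in A$, we have

$(u \circ a, v \circ a) \in \{(u,v)\}^\sharp \subseteq \tau$.

In view of (6.2), and by maximality of $|\mathrm{lcs}(u,v)|$, we get

$(u \circ a, v \circ a) \in \lambda$. (6.3)

Note also that

$\lambda \subset (\lambda \cup \{(u,v)\})^\sharp \subseteq \tau$

yields

$\tau = (\lambda \cup \{(u,v)\})^\sharp$ (6.4)

since $\tau$ covers $\lambda$.

Let $(y,z) \in \sigma' \setminus \sigma$. Then (6.4) yields

$(y,z) \in \rho = (\sigma \vee \tau) = (\sigma \cup (\lambda \cup \{(u,v)\})^\sharp)^\sharp = (\sigma \cup \{(u,v)\})^\sharp$

and so there exists some finite sequence $w_0, \ldots, w_n \in A^k$ such that:

• $w_0 = y$ and $w_n = z$;

• for every $i = 1, \ldots, n$, there exist $(r_i, s_i) \in \sigma \cup \{(u,v)\}$ and $x_i \in A^*$ such that
$\{w_{i-1}, w_i\} = \{r_i \circ x_i, s_i \circ x_i\}$.

Now by (6.3) we may assume that $x_i = \varepsilon$ whenever $(r_i, s_i) = (u,v)$. Since we may assume that the
$w_i$ are all distinct, the relation $(u,v)$ is used at most once, indeed exactly once since $(y,z) \notin \sigma$ and
$(r_i, s_i) \in \sigma$ implies $(r_i \circ x_i, s_i \circ x_i) \in \sigma$. We may assume without loss of generality that $u = w_{j-1}$ and
$v = w_j$ for some $j \in \{1, \ldots, n\}$. Hence

$y = w_0 \,\sigma\, w_{j-1} = u$, $v = w_j \,\sigma\, w_n = z$

and so

$u = w_{j-1} \,\sigma'\, y \,\sigma'\, z \,\sigma'\, w_j = v$.

It follows that $\lambda \cup \{(u,v)\} \subseteq \sigma'$. By (6.4), we get $\tau \subseteq \sigma'$, a contradiction. Therefore $\mathrm{RC}(A^k)$ is
semimodular. □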


Since a semimodular lattice of finite height (i.e. the length of chains is bounded) satisfies the
Jordan-Dedekind condition (i.e. all maximal chains have the same length), we immediately obtain:

Corollary 6.2 Let $A$ be a nonempty set and $k \ge 1$. Then $\mathrm{RC}(A^k)$ satisfies the Jordan-Dedekind
condition.

We show next that we cannot replace semimodular by modular in Proposition 6.1.

Proposition 6.3 Let $k \ge 1$ and let $A$ be a set with $|A| \ge 4$. Then $\mathrm{RC}(A^k)$ is not modular.

Proof. Let $a, b, c, d \in A$ be distinct. Let $\lambda$ be the identity relation on $A^k$ and let

$\sigma = \lambda \cup \{a^k, ba^{k-1}\}^2$;
$\sigma' = \lambda \cup \{a^k, ba^{k-1}\}^2 \cup \{ca^{k-1}, da^{k-1}\}^2$;
$\tau = \lambda \cup \{a^k, da^{k-1}\}^2 \cup \{ba^{k-1}, ca^{k-1}\}^2$;
$\rho = \lambda \cup \{a^k, ba^{k-1}, ca^{k-1}, da^{k-1}\}^2$.

It is routine to check that all the above relations are right congruences on $A^k$. Moreover,

$\lambda \subset \sigma \subset \sigma' \subset \rho$, $\lambda \subset \tau \subset \rho$,
$\sigma' \cap \tau = \lambda$, $(\sigma \vee \tau) = \rho$,

hence $\{\lambda, \sigma, \sigma', \tau, \rho\}$ is a sublattice of $\mathrm{RC}(A^k)$ of the form (6.1) and so $\mathrm{RC}(A^k)$ is not modular. □


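We can also show that $\mathrm{RC}(A^k)$ can only be atomistic in trivial cases:

Proposition 6.4 Let $k \ge 2$ and let $A$ be a set with $|A| \ge 2$. Then $\mathrm{RC}(A^k)$ is not atomistic.

Proof. Let $\lambda$ be the identity relation on $A^k$. Let $a, b \in A$ be distinct and let

$\sigma = \lambda \cup \{a^k, b^2a^{k-2}, ba^{k-1}\}^2 \cup \{a^{k-1}b, ba^{k-2}b\}^2$;
$\tau = \lambda \cup \{a^k, ba^{k-1}\}^2 \cup \{a^{k-1}b, ba^{k-2}b\}^2$.

It is routine to check that $\sigma, \tau \in \mathrm{RC}(A^k)$. Moreover, $\lambda \subset \tau \subset \sigma$. We show that

$\sigma = \{(xa^{k-1}, b^2a^{k-2})\}^\sharp$ (6.5)

for every $x \in \{a,b\}$. Indeed, let $\rho = \{(xa^{k-1}, b^2a^{k-2})\}^\sharp$. Then $(xa^{k-1}, b^2a^{k-2}) \in \rho$ yields
$(a^k, ba^{k-1}) \in \rho$ and so $\{a^k, b^2a^{k-2}, ba^{k-1}\}^2 \subseteq \rho$. Finally, $(xa^{k-1}, b^2a^{k-2}) \in \rho$ yields
$(a^{k-1}b, ba^{k-2}b) \in \rho$ and so

$\sigma \subseteq \{(xa^{k-1}, b^2a^{k-2})\}^\sharp$.

Since $(xa^{k-1}, b^2a^{k-2}) \in \sigma$ for $x \in \{a,b\}$, (6.5) holds.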

Now we claim that $\tau$ is the unique element of $\mathrm{RC}(A^k)$ covered by $\sigma$. Indeed, assume that $\rho \subset \sigma$.
In view of (6.5), we have $(a^k, b^2a^{k-2}) \notin \rho$ and $(ba^{k-1}, b^2a^{k-2}) \notin \rho$. Hence $\rho \subseteq \tau$. Since $\sigma$ is not an
atom, it follows that

$\alpha \le \sigma$ if and only if $\alpha \le \tau$

for every atom $\alpha$ of $\mathrm{RC}(A^k)$. Thus $\sigma$ cannot be expressed as a join of atoms and so $\mathrm{RC}(A^k)$ is not
atomistic. □

7 Special right congruences on $A^k$

To avoid trivial cases, we assume throughout this section that $A$ is a finite alphabet containing at
least two elements. We define

$\mathcal{I}_k(A) = \{I \trianglelefteq A^* \mid A^k \subseteq I\}$,
$\mathcal{L}_k(A) = \{L \trianglelefteq_\ell A^* \mid A^k \subseteq L\}$.

If we order $\mathcal{I}_k(A)$ (or $\mathcal{L}_k(A)$) by inclusion, we get a finite (distributive) lattice where meet and join
are given by

$(I \wedge J) = I \cap J$, $(I \vee J) = I \cup J$.

The top element is $A^*$ and the bottom element is $A^kA^*$.

Given $L \in \mathcal{L}_k(A)$, we define a relation $\tau_L$ on $A^k$ by:

$u \,\tau_L\, v$ if $u$ and $v$ have a common suffix in $L$.

Lemma 7.1 Let $L \in \mathcal{L}_k(A)$. Then $\tau_L$ is an equivalence relation on $A^k$.

Proof. It is immediate that $\tau_L$ is symmetric. Since $A^k \subseteq L$, it is reflexive. Assume now that
$u, v, w \in A^k$ and $x, y \in L$ are such that $x \le_s u, v$ and $y \le_s v, w$. Since $x$ and $y$ are both suffixes of $v$,
one of them is a suffix of the other. Hence either $x \le_s u, w$ or $y \le_s u, w$. Therefore $\tau_L$ is transitive. □


Being a right congruence turns out to be a special case:

Proposition 7.2 Let $L \in \mathcal{L}_k(A)$. Then the following conditions are equivalent:

(i) $\tau_L \in \mathrm{RC}(A^k)$;

(ii) $L \in \mathcal{I}_k(A)$;

(iii) $(L\beta_\ell)A \subseteq A^*(L\beta_\ell)$;

(iv) $L\beta_\ell$ is a semaphore code.

Proof. (i) ⇒ (iii). Let $u \in L\beta_\ell$ and $a \in A$. Since $A^*(L\beta_\ell) = L \supseteq A^k$, we may assume that
$|u| \le k - 1$. Let $b \in A \setminus \{a\}$ and write $m = k - |u|$. Then $(a^mu, b^mu) \in \tau_L$, hence

$(a^{m-1}ua, b^{m-1}ua) = (a^mu \circ a, b^mu \circ a) \in \tau_L$.

It follows that $a^{m-1}ua$ and $b^{m-1}ua$ must share a suffix in $L$, and so $ua$ itself must have a suffix in
$L$. Thus

$(L\beta_\ell)A \subseteq A^*L = L = A^*(L\beta_\ell)$.

(iii) ⇒ (ii). We have

$LA = A^*(L\beta_\ell)A \subseteq A^*(L\beta_\ell) = L$.

It follows that $LA^* \subseteq L$. Since $L \in \mathcal{L}_k(A)$, we get $L \in \mathcal{I}_k(A)$.

(ii) ⇒ (i). By Lemma 7.1, $\tau_L$ is an equivalence relation. Let $u, v \in A^k$ be such that $u \,\tau_L\, v$. Then
$w \le_s u, v$ for some $w \in L$. We may assume that $|w| < k$. Let $a \in A$. Since $L \trianglelefteq A^*$, we have $wa \in L$.
Since $|w| < k$, it follows that $wa$ is a common suffix of $u \circ a$ and $v \circ a$. Therefore $(u \circ a) \,\tau_L\, (v \circ a)$ and
we are done.

(iii) ⇔ (iv). This follows from Lemma 4.1, since $L\beta_\ell$ is always a suffix code. □




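Note that we can easily produce examples of $L \in \mathcal{L}_k(A) \setminus \mathcal{I}_k(A)$:

Example 7.3 Let $A = \{a,b\}$, $k = 3$ and $L = A^*b \cup A^+Aa$. Then $L \in \mathcal{L}_k(A)$ but $\tau_L \notin \mathrm{RC}(A^k)$.

Indeed, $b \in L$ but $ba \notin L$, hence $L \notin \mathcal{I}_k(A)$ and so $\tau_L \notin \mathrm{RC}(A^k)$ by Proposition 7.2. Note that
in this case $L\beta_\ell = \{b, a^3, ba^2, aba, b^2a\}$.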
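Example 7.3 can be confirmed by brute force. Because $L$ is a left ideal, two words have a common suffix in $L$ exactly when their longest common suffix lies in $L$, so $\tau_L$ can be tested through a membership predicate. In the sketch below the predicates `in_L` and `in_I` and the comparison ideal are our own choices, made only for illustration.

```python
from itertools import product


def lcs(u, v):
    """Longest common suffix of the words u and v."""
    n = 0
    while n < min(len(u), len(v)) and u[-1 - n] == v[-1 - n]:
        n += 1
    return u[len(u) - n:]


def tau_is_right_congruence(in_L, alphabet, k):
    """Check whether tau_L is compatible with u -> u o a on A^k.

    `in_L` is a membership test for a left ideal L containing A^k, so that
    u tau_L v holds exactly when lcs(u, v) lies in L.
    """
    words = ["".join(w) for w in product(alphabet, repeat=k)]
    for u in words:
        for v in words:
            if in_L(lcs(u, v)):
                for a in alphabet:
                    if not in_L(lcs((u + a)[-k:], (v + a)[-k:])):
                        return False
    return True


# Example 7.3: words ending in b, or words of length at least 3 ending in a.
def in_L(w):
    return w.endswith("b") or (len(w) >= 3 and w.endswith("a"))


# For comparison, a two-sided ideal containing A^3 (our own choice): words that
# contain the letter b or have length at least 3.
def in_I(w):
    return "b" in w or len(w) >= 3


print(tau_is_right_congruence(in_L, "ab", 3), tau_is_right_congruence(in_I, "ab", 3))
```

A witness for the failure in Example 7.3 is the pair $(a^2b, b^3)$: it lies in $\tau_L$ through the common suffix $b$, but after applying $a$ the longest common suffix is $ba \notin L$.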

Inclusion among left ideals determines inclusion for the equivalence relations $\tau_L$:

Lemma 7.4 Let $|A| > 1$ and $L, L' \in \mathcal{L}_k(A)$. Then

$\tau_L \subseteq \tau_{L'} \Leftrightarrow L \subseteq L'$.

Proof. Assume that $L \subseteq L'$. Let $(u,v) \in \tau_L$. Then $u$ and $v$ share a common suffix in $L$ and therefore
in $L'$. Thus $(u,v) \in \tau_{L'}$.

Assume now that $L \not\subseteq L'$. Let $w \in L \setminus L'$ have minimum length. Since $A^k \subseteq L'$, we have $|w| < k$.
Let $n = k - |w|$. Fix $a, b \in A$ distinct and take $(u,v) = (a^nw, b^nw) \in A^k \times A^k$. Since $w \in L$, we have
$(u,v) \in \tau_L$. Now $w$ is the longest common suffix of $u$ and $v$. Since $w \notin L'$, it follows that $(u,v) \notin \tau_{L'}$. □

Note that Lemma 7.4 does not hold for $|A| = 1$, since then $|A^k| = 1$.

Definition 7.5 We say that $\rho \in \mathrm{RC}(A^k)$ is a special right congruence on $A^k$ if $\rho = \tau_I$ for some
$I \in \mathcal{I}_k(A)$. In view of Proposition 7.2, this is equivalent to saying that $\rho = \tau_{A^*S}$ for some semaphore
code $S$ on $A$ such that $A^k \subseteq A^*S$. We denote by $\mathrm{SRC}(A^k)$ the set of all special right congruences on
$A^k$.

Note that not every semaphore code $S$ satisfies the condition $A^k \subseteq A^*S$. However, it is easy to
derive a semaphore code from $S$ that does by considering

$S' = (S \cap A^{\le k}) \cup (A^k \setminus A^*S)$. (7.1)

$S'$ is a suffix code: the elements in $S \cap A^{\le k}$ are incomparable in the suffix order since $S$ is a suffix
code, and by construction any element in $A^k \setminus A^*S$ is incomparable with the elements in $S \cap A^{\le k}$
and vice versa. Furthermore, $A^k \subseteq A^*S' \supseteq A^*S$ and $SA \subseteq A^*S$ by Lemma 4.1. Thus $S'A \subseteq A^*S'$
and so by Lemma 4.1 $S'$ is a semaphore code.

Proposition 7.6 Let $|A| > 1$. Then:

(i) $\tau_{I \cap J} = \tau_I \cap \tau_J$ and $\tau_{I \cup J} = \tau_I \cup \tau_J$ for all $I, J \in \mathcal{I}_k(A)$;

(ii) $\mathrm{SRC}(A^k)$ is a full sublattice of $\mathrm{RC}(A^k)$;

(iii) the mapping

$\mathcal{I}_k(A) \to \mathrm{SRC}(A^k)$, $I \mapsto \tau_I$

is a lattice isomorphism.

Proof. (i) By Lemma 7.4, we have $\tau_{I \cap J} \subseteq \tau_I \cap \tau_J$ and $\tau_I \cup \tau_J \subseteq \tau_{I \cup J}$.

Let $(u,v) \in \tau_I \cap \tau_J$. Then there exist $x \in I$ and $y \in J$ such that $x \le_s u, v$ and $y \le_s u, v$. Since
$x$ and $y$ are both suffixes of the same word, one of them is a suffix of the other, say $x \le_s y$. Then
$y \in I \cap J$ and so $(u,v) \in \tau_{I \cap J}$. Thus $\tau_{I \cap J} = \tau_I \cap \tau_J$.

Assume now that $(u,v) \in \tau_{I \cup J}$. Then there exists some $x \in I \cup J$ such that $x \le_s u, v$. If $x \in I$,
then $(u,v) \in \tau_I$, otherwise $(u,v) \in \tau_J$. Therefore $\tau_{I \cup J} = \tau_I \cup \tau_J$.

(ii) Let $I, J \in \mathcal{I}_k(A)$. By part (i), $\tau_{I \cap J}$ is the meet of $\tau_I$ and $\tau_J$ in both $\mathrm{RC}(A^k)$ and $\mathrm{SRC}(A^k)$.
And $\tau_{I \cup J}$ is the join of $\tau_I$ and $\tau_J$ in both $\mathrm{RC}(A^k)$ and $\mathrm{SRC}(A^k)$.

Finally, $\tau_{A^kA^*}$ is the identity relation and therefore the bottom element of both lattices. And
$\tau_{A^*}$ is the universal relation and therefore the top element of both lattices.

(iii) This follows from Lemma 7.4. □

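Given $\rho \in \mathrm{RC}(A^k)$ and $C \in A^k/\rho$, we denote by $\mathrm{lcs}(C)$ the longest common suffix of all words in
$C$. We define

$\Lambda_\rho = \{\mathrm{lcs}(C) \mid C \in A^k/\rho\}$ and $\Lambda'_\rho = \{\mathrm{lcs}(u,v) \mid (u,v) \in \rho\}$. (7.2)

Lemma 7.7 Let $\rho \in \mathrm{RC}(A^k)$. Then $A^*\Lambda_\rho = A^*\Lambda'_\rho \in \mathcal{I}_k(A)$.

Proof. Let $C \in A^k/\rho$ and let $w = \mathrm{lcs}(C)$. If $|w| = k$, then $w = \mathrm{lcs}(w,w)$. If $|w| < k$, then
by maximality of $w$ there exist $a, b \in A$ distinct and $u, v \in A^*$ such that $uaw, vbw \in C$. Thus
$w = \mathrm{lcs}(uaw, vbw)$ and so

$\Lambda_\rho \subseteq \Lambda'_\rho$. (7.3)

Therefore $A^*\Lambda_\rho \subseteq A^*\Lambda'_\rho$.

Conversely, let $(u,v) \in \rho$. Then $\mathrm{lcs}(u\rho)$ is a suffix of $\mathrm{lcs}(u,v)$, hence $\Lambda'_\rho \subseteq A^*\Lambda_\rho$ and so
$A^*\Lambda_\rho = A^*\Lambda'_\rho$.

Clearly, $A^*\Lambda'_\rho \trianglelefteq_\ell A^*$. Since $u = \mathrm{lcs}(u,u)$ for every $u \in A^k$, we have $A^k \subseteq \Lambda'_\rho$. Hence it suffices to
show that $(\Lambda'_\rho)A \subseteq A^*\Lambda'_\rho$.

Let $(u,v) \in \rho$ and $a \in A$. We must show that $(\mathrm{lcs}(u,v))a \in A^*\Lambda'_\rho$. Since $A^k \subseteq \Lambda'_\rho$, we may
assume that $|\mathrm{lcs}(u,v)| \le k - 1$. Then $(\mathrm{lcs}(u,v))a = \mathrm{lcs}(u \circ a, v \circ a)$. Since $(u \circ a, v \circ a) \in \rho$, we get
$(\mathrm{lcs}(u,v))a \in \Lambda'_\rho$ and we are done. □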

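Given $\rho \in \mathrm{RC}(A^k)$, we write

$\mathrm{Res}(\rho) = \mathrm{Res}(\mathrm{Cay}(\rho))$.

We refer to the elements of $\mathrm{Res}(\rho)$ as the resets of $\rho$.

Lemma 7.8 Let $\rho \in \mathrm{RC}(A^k)$. Then:

(i) $\mathrm{Res}(\rho) = \{w \in A^* \mid u \,\rho\, v \text{ for all } u, v \in A^k \cap (A^*w)\}$;

(ii) $\mathrm{Res}(\rho) \in \mathcal{I}_k(A)$.

Proof. (i) Let $w \in \mathrm{Res}(\rho)$ and suppose that $u = u'w \in A^k$, $v = v'w \in A^k$. Since $w \in \mathrm{Res}(\rho)$, we
have paths

$p \xrightarrow{u'} p' \xrightarrow{w} r$, $q \xrightarrow{v'} q' \xrightarrow{w} r$

in $\mathrm{Cay}(\rho)$. It follows from the definition of $\mathrm{Cay}(\rho)$ that

$u\rho = (u'w)\rho = r = (v'w)\rho = v\rho$,

hence the direct inclusion holds.

To prove the opposite inclusion, we suppose that $w \in A^* \setminus \mathrm{Res}(\rho)$. Then there exist paths

$p' \xrightarrow{w} p$, $q' \xrightarrow{w} q$

in $\mathrm{Cay}(\rho)$ with $p \neq q$. If $w$ has a suffix $w'$ of length $k$, then every path labeled by $w$ ends necessarily
in $w'\rho$, hence we must have $|w| < k$. Since $\mathrm{Cay}(\rho)$ is strongly connected by Proposition 5.1, there
exist paths

$p'' \xrightarrow{x} p' \xrightarrow{w} p$, $q'' \xrightarrow{y} q' \xrightarrow{w} q$

in $\mathrm{Cay}(\rho)$ with $|xw| = |yw| = k$. But then

$(xw)\rho = p \neq q = (yw)\rho$

and we are done.

(ii) It is immediate that $\mathrm{Res}(\rho) \trianglelefteq A^*$. Since every path in $\mathrm{Cay}(\rho)$ labeled by $w \in A^k$ ends
necessarily in $w\rho$, we have $A^k \subseteq \mathrm{Res}(\rho)$ and so $\mathrm{Res}(\rho) \in \mathcal{I}_k(A)$. □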

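We can now compare a right congruence with a special right congruence:

Proposition 7.9 Let $|A| > 1$, $\rho \in \mathrm{RC}(A^k)$ and $I \in \mathcal{I}_k(A)$. Then:

(i) $\rho \subseteq \tau_I \Leftrightarrow \Lambda_\rho \subseteq I \Leftrightarrow \Lambda'_\rho \subseteq I$;

(ii) $\tau_I \subseteq \rho \Leftrightarrow I \subseteq \mathrm{Res}(\rho)$.

Proof. (i) Assume that $\rho \subseteq \tau_I$. Let $(u,v) \in \rho$. Then $u$ and $v$ have a common suffix in $I$, hence
$\mathrm{lcs}(u,v)$ has a suffix in $I$ and so $\Lambda'_\rho \subseteq A^*I = I$.

By (7.3), $\Lambda'_\rho \subseteq I$ implies $\Lambda_\rho \subseteq I$.

Finally, assume that $\Lambda_\rho \subseteq I$. Let $(u,v) \in \rho$ and write $w = \mathrm{lcs}(u\rho) \in \Lambda_\rho \subseteq I$. Since $w$ is a suffix
of both $u$ and $v$, we get $(u,v) \in \tau_I$. Thus $\rho \subseteq \tau_I$ as required.

(ii) Assume that $\tau_I \subseteq \rho$. Let $w \in I$ and let $u, v \in A^k \cap (A^*w)$. Since $u, v$ have a common suffix
in $I$, we get $(u,v) \in \tau_I \subseteq \rho$. Thus $w \in \mathrm{Res}(\rho)$ by Lemma 7.8(i) and so $I \subseteq \mathrm{Res}(\rho)$.

Conversely, assume that $I \subseteq \mathrm{Res}(\rho)$. Let $(u,v) \in \tau_I$. Then we may write $u = u'w$, $v = v'w$ with
$w \in I \subseteq \mathrm{Res}(\rho)$. Since $u, v \in A^k \cap (A^*w)$, it follows from Lemma 7.8(i) that $(u,v) \in \rho$ and so $\tau_I \subseteq \rho$. □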


We can now prove several equivalent characterizations of special right congruences:

Proposition 7.10 Let $|A| > 1$ and $\rho \in \mathrm{RC}(A^k)$. Then the following conditions are equivalent:

(i) $\rho \in \mathrm{SRC}(A^k)$;

(ii) $\mathrm{lcs} : A^k/\rho \to A^{\le k}$ is injective and $\Lambda_\rho$ is a suffix code;

(iii) $\rho = \tau_{A^*\Lambda_\rho}$;

(iv) $\rho = \tau_{A^*\Lambda'_\rho}$;

(v) $\rho = \tau_{\mathrm{Res}(\rho)}$;

(vi) $\rho = \tau_L^\sharp$ for some $L \in \mathcal{L}_k(A)$;

(vii) $\Lambda_\rho \subseteq \mathrm{Res}(\rho)$;

(viii) $\Lambda'_\rho \subseteq \mathrm{Res}(\rho)$;

(ix) whenever

$p \xrightarrow{aw} q$, $p' \xrightarrow{bw} q$, $p'' \xrightarrow{w} r$ (7.4)

are paths in $\mathrm{Cay}(\rho)$ with $a, b \in A$ distinct, then $q = r$.

Proof. (i) ⇒ (ii). We start by proving that

$\mathrm{lcs}(u\tau_I) \in I$ (7.5)

for all $I \in \mathcal{I}_k(A)$ and $u \in A^k$.

Indeed, for every $w \in u\tau_I$, there exists some $w' \in I$ such that $w' \le_s u, w$. Let $z$ be the shortest
suffix among the $w'$. Then $z \in I$ and $z \le_s w$ for every $w \in u\tau_I$, hence $z \le_s \mathrm{lcs}(u\tau_I)$. Since $I \trianglelefteq A^*$, it
follows that $\mathrm{lcs}(u\tau_I) \in I$ and so (7.5) holds.

Assume that $\rho = \tau_I$ for some $I \in \mathcal{I}_k(A)$. We prove that

$\mathrm{lcs}(u\rho) \le_s \mathrm{lcs}(v\rho) \Rightarrow (u,v) \in \rho$ (7.6)

holds for all $u, v \in A^k$. Assume that $\mathrm{lcs}(u\rho) \le_s \mathrm{lcs}(v\rho)$. Since $\mathrm{lcs}(u\rho) \le_s u$ and $\mathrm{lcs}(v\rho) \le_s v$, it
follows that $\mathrm{lcs}(u\rho)$ is a suffix of both $u$ and $v$. Now (7.5) yields $\mathrm{lcs}(u\rho) = \mathrm{lcs}(u\tau_I) \in I$ and so $u, v$
have a common suffix in $I$. Therefore $(u,v) \in \tau_I = \rho$ and (7.6) holds.

Now (ii) follows from (7.6).

(ii) ⇒ (iii). Write $I = A^*\Lambda_\rho$. If $(u,v) \in \rho$, then $\mathrm{lcs}(u\rho) \in \Lambda_\rho \subseteq I$ is a suffix of both $u$ and $v$,
hence $(u,v) \in \tau_I$.

Conversely, let $(u,v) \in \tau_I$. Then there exists some $w \in \Lambda_\rho$ such that $w \le_s u, v$. Suppose that
$\mathrm{lcs}(u\rho) \neq w$. Then $\mathrm{lcs}(u\rho) <_s w$ or $w <_s \mathrm{lcs}(u\rho)$, contradicting $\Lambda_\rho$ being a suffix code. Hence
$\mathrm{lcs}(u\rho) = w$. Similarly, $\mathrm{lcs}(v\rho) = w$. Since $\mathrm{lcs} : A^k/\rho \to A^{\le k}$ is injective, we get $u\rho = v\rho$. Thus
$\rho = \tau_I$.

(iii) ⇔ (iv). This follows from Lemma 7.7.

(iii) ⇒ (vi). Write $L = A^*\Lambda_\rho$. By (iii), we have $\tau_L^\sharp = \tau_L = \rho$. Since $L \in \mathcal{L}_k(A)$ by Lemma 7.7,
(vi) holds.

(vi) ⇒ (i). Let $I = LA^* \in \mathcal{I}_k(A)$. Since $L \subseteq I$, it follows from Lemma 7.4 that $\tau_L \subseteq \tau_I$, hence

$\rho = \tau_L^\sharp \subseteq \tau_I^\sharp = \tau_I$

by Proposition 7.2.

Now assume that $(u,v) \in \tau_I$. Then there exist factorizations $u = u'w$ and $v = v'w$ with $w \in I$.
Write $w = zw'$ with $z \in L$. Then $(w'u'z, w'v'z) \in \tau_L$ and so

$(u,v) = (u'w, v'w) = (u'zw', v'zw') = (w'u'z \circ w', w'v'z \circ w') \in \tau_L^\sharp = \rho$.

Thus $\tau_I \subseteq \rho$ as required.

(i) ⇒ (v). If $\rho = \tau_I$ for some $I \in \mathcal{I}_k(A)$, then $I \subseteq \mathrm{Res}(\rho)$ by Proposition 7.9(ii). Since
$\mathrm{Res}(\rho) \in \mathcal{I}_k(A)$ by Lemma 7.8(ii), then Proposition 7.9(ii) also yields

$\tau_{\mathrm{Res}(\rho)} \subseteq \rho = \tau_I$,

hence $\mathrm{Res}(\rho) \subseteq I$ by Lemma 7.4. Therefore $I = \mathrm{Res}(\rho)$.

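(v) ⇒ (vii) ⇔ (viii). By Lemma 7.8(ii), $\mathrm{Res}(\rho) \in \mathcal{I}_k(A)$. Now we apply Proposition 7.9(i).

(viii) ⇒ (i). We have $A^*\Lambda'_\rho, \mathrm{Res}(\rho) \in \mathcal{I}_k(A)$ by Lemmas 7.7 and 7.8(ii). It follows from
Proposition 7.9 that

$\tau_{\mathrm{Res}(\rho)} \subseteq \rho \subseteq \tau_{A^*\Lambda'_\rho}$.

Since $\Lambda'_\rho \subseteq \mathrm{Res}(\rho)$ yields $A^*\Lambda'_\rho \subseteq \mathrm{Res}(\rho)$ and therefore $\tau_{A^*\Lambda'_\rho} \subseteq \tau_{\mathrm{Res}(\rho)}$ by Lemma 7.4, we get

$\rho = \tau_{\mathrm{Res}(\rho)} \in \mathrm{SRC}(A^k)$.

(viii) ⇒ (ix). Consider the paths in (7.4). Since $A^k \subseteq \mathrm{Res}(\rho)$ by Lemma 7.8(ii), we may assume
that $|w| < k$. Since $\mathrm{Cay}(\rho)$ is strongly connected, there exist paths

$s \xrightarrow{x} p \xrightarrow{aw} q$, $s' \xrightarrow{x'} p' \xrightarrow{bw} q$

such that $xaw, x'bw \in A^k$. Hence

$w = \mathrm{lcs}(xaw, x'bw) \in \Lambda'_\rho \subseteq \mathrm{Res}(\rho)$

and so $q = r$.

(ix) ⇒ (viii). Let $w \in \Lambda'_\rho$. Since $A^k \subseteq \mathrm{Res}(\rho)$ by Lemma 7.8(ii), we may assume that $|w| < k$.
Then $w = \mathrm{lcs}(u,v)$ for some distinct $\rho$-equivalent $u, v \in A^k$. Hence we may write $u = u'aw$ and
$v = v'bw$ with $a, b \in A$ distinct. Since $u\rho = v\rho$, it follows that there exist in $\mathrm{Cay}(\rho)$ paths of the
form

$s \xrightarrow{u'} p \xrightarrow{aw} u\rho$, $s' \xrightarrow{v'} p' \xrightarrow{bw} v\rho$.

Now (ix) implies that $w \in \mathrm{Res}(\rho)$. □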


Corollary 7.11 If $\rho \in \mathrm{SRC}(A^k)$ with $|A| > 1$, then $\Lambda_\rho$ is a semaphore code.

Proof. By Proposition 7.10(ii), $\Lambda_\rho$ is a suffix code. Furthermore, by Lemma 7.7 we have
$A^*\Lambda_\rho \in \mathcal{I}_k(A)$, which in turn implies by Proposition 7.2 that $(A^*\Lambda_\rho)\beta_\ell = \Lambda_\rho$ is a semaphore code. □

We can now prove that not all right congruences are special, even for $|A| = 2$:

Example 7.12 Let $A = \{a,b\}$ and let $\rho$ be the equivalence relation on $A^3$ defined by the following
partition:

$\{a^3, aba, ba^2\} \cup \{bab, a^2b\} \cup \{ab^2\} \cup \{b^2a\} \cup \{b^3\}$.

Then $\rho \in \mathrm{RC}(A^3) \setminus \mathrm{SRC}(A^3)$.

Indeed, it is routine to check that $\rho \in \mathrm{RC}(A^3)$. Since $\mathrm{lcs}(a^3\rho) = a$ and $\mathrm{lcs}((b^2a)\rho) = b^2a$, then $\Lambda_\rho$
is not a suffix code and so $\rho \notin \mathrm{SRC}(A^3)$ by Proposition 7.10.
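The routine checks in Example 7.12 can be carried out mechanically. The sketch below verifies that the partition is a right congruence, computes the set of longest common suffixes of its classes, and confirms that this set is not a suffix code. All function names are our own.

```python
from functools import reduce


def lcs(u, v):
    """Longest common suffix of the words u and v."""
    n = 0
    while n < min(len(u), len(v)) and u[-1 - n] == v[-1 - n]:
        n += 1
    return u[len(u) - n:]


def is_right_congruence(partition, alphabet, k):
    """Each letter must send every class into a single class."""
    block_of = {u: i for i, block in enumerate(partition) for u in block}
    return all(
        len({block_of[(u + a)[-k:]] for u in block}) == 1
        for block in partition
        for a in alphabet
    )


def lambda_rho(partition):
    """The set of longest common suffixes of the classes."""
    return {reduce(lcs, sorted(block)) for block in partition}


def is_suffix_code(S):
    """No word of S is a proper suffix of another word of S."""
    return not any(s != t and t.endswith(s) for s in S for t in S)


# The partition of A^3 from Example 7.12.
rho = [{"aaa", "aba", "baa"}, {"bab", "aab"}, {"abb"}, {"bba"}, {"bbb"}]
print(is_right_congruence(rho, "ab", 3), sorted(lambda_rho(rho)), is_suffix_code(lambda_rho(rho)))
```

The suffix code property fails because $a$ is a proper suffix of $b^2a$, which is exactly the obstruction named in the example.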

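Let $\rho \in \mathrm{RC}(A^k)$ and let

$\underline{\rho} = \bigvee\{\tau \in \mathrm{SRC}(A^k) \mid \tau \subseteq \rho\}$,
$\overline{\rho} = \bigwedge\{\tau \in \mathrm{SRC}(A^k) \mid \tau \supseteq \rho\}$. (7.7)

By Proposition 7.6(ii), we have $\underline{\rho}, \overline{\rho} \in \mathrm{SRC}(A^k)$.

Proposition 7.13 Let $|A| > 1$ and $\rho \in \mathrm{RC}(A^k)$. Then:

(i) $\underline{\rho} = \tau_{\mathrm{Res}(\rho)}$;

(ii) $\overline{\rho} = \tau_{A^*\Lambda_\rho} = \tau_{A^*\Lambda'_\rho}$.

Proof. (i) By Lemma 7.8(ii), we have $\mathrm{Res}(\rho) \in \mathcal{I}_k(A)$. Now the claim follows from
Proposition 7.9(ii).

(ii) Similarly, we have $A^*\Lambda_\rho = A^*\Lambda'_\rho \in \mathcal{I}_k(A)$ by Lemma 7.7, and the claim follows from
Proposition 7.9(i). □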


The next counterexample shows that the pair {p,p) does not univocally determine p G RC(A^): 
Example 7.14 Let A = {a, b} and let p, p' be the equivalence relations on ^ defined by the following 
partitions: 

{a^, aba, ba^} U {bab, a^b} U {ab^} U {b^a} U {b^}, 

{a^, 6^a, 6a^} U {bab, a'^b} U {ab^} U {aba} U {6^}. 

Then p, p' G RC(A^), p = p' and fi = p'. 

Indeed, we claimed in Example 7.12 that p is a right congruence, and the verification for p' is 
also straightforward. 

It is easy to see that 

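$$\mathrm{Res}(\rho) = A^*A^3 \cup \{a^2, ab\} = \mathrm{Res}(\rho'),$$

hence $\underline{\rho} = \underline{\rho'}$ by Proposition 7.13(i). Since

$$\Lambda_\rho = \{a, ab, ab^2, b^2a, b^3\}$$

and

$$\Lambda_{\rho'} = \{a, ab, ab^2, aba, b^3\},$$

we obtain

$$A^*\Lambda_\rho = A^+ \setminus \{b, b^2\} = A^*\Lambda_{\rho'},$$

and Proposition 7.13(ii) yields $\overline{\rho} = \overline{\rho'}$.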
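
The two computations of Example 7.14 can be reproduced by brute force. The sketch below is a hedged illustration, not the authors' method: it assumes that a reset is a word acting as a constant map on the classes, with the action on $A^3$ given by appending and truncating to the last three letters, and it only inspects words of length at most $3$. All function names are introduced for this illustration.

```python
from itertools import product

K = 3
ALPHABET = "ab"
RHO = [{"aaa", "aba", "baa"}, {"bab", "aab"}, {"abb"}, {"bba"}, {"bbb"}]
RHO_PRIME = [{"aaa", "bba", "baa"}, {"bab", "aab"}, {"abb"}, {"aba"}, {"bbb"}]

def words(length):
    """All words of the given length over the alphabet."""
    return ["".join(w) for w in product(ALPHABET, repeat=length)]

def class_of(u, classes):
    return next(i for i, c in enumerate(classes) if u in c)

def is_reset(w, classes, k=K):
    """Assumed definition: u.w lies in one block for every u in A^k."""
    return len({class_of((u + w)[-k:], classes) for u in words(k)}) == 1

def short_resets(classes, k=K):
    """Resets of length < k (words of length >= k are always resets here)."""
    return {w for n in range(1, k) for w in words(n) if is_reset(w, classes)}

def lcs(block):
    """Longest common suffix of a nonempty block of words."""
    block = list(block)
    suffix = block[0]
    while not all(w.endswith(suffix) for w in block):
        suffix = suffix[1:]
    return suffix

def short_words_outside_ideal(classes, k=K):
    """Words of length <= k having no suffix in Lambda = {lcs(c)}."""
    lam = {lcs(c) for c in classes}
    return {
        w
        for n in range(1, k + 1)
        for w in words(n)
        if not any(w.endswith(s) for s in lam)
    }

print(sorted(short_resets(RHO)), sorted(short_resets(RHO_PRIME)))
print(sorted(short_words_outside_ideal(RHO)), sorted(short_words_outside_ideal(RHO_PRIME)))
```

Both congruences yield the same short resets and the same short words outside $A^*\Lambda$, in line with the displayed identities.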


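This same example also shows that $\overline{\rho}$ does not necessarily equal or cover $\underline{\rho}$ in $\mathrm{SRC}(A^k)$. Indeed,
in this case we have

$$\mathrm{Res}(\rho) = A^*A^3 \cup \{a^2, ab\} \subset I \subset A^+ \setminus \{b, b^2\} = A^*\Lambda_\rho$$

for $I = A^*A^3 \cup \{a^2, ab, ba\} \in \mathcal{I}_k(A)$. By Lemma 7.4, we get

$$\underline{\rho} \subset \tau_I \subset \overline{\rho}.$$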


8 Random walks on semaphore codes 

As we have seen in Proposition 7.13, semaphore codes approximate right congruences from above
and below in the lattice structure. In this section, we will define random walks (or more specifically
Markov chains) on semaphore codes. The property that makes this possible is that a semaphore
code $S$ over the alphabet $A$ satisfies

$$SA \subseteq A^*S, \tag{8.1}$$

see Lemma 4.1. Namely, (8.1) implies a right action of $A$ on $S$: for $a \in A$ and $s \in S$, the action $s.a$
is the element $t \in S$ such that $sa = wt$ with $w \in A^*$, which exists by (8.1).
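
As an illustration of this action, the following Python sketch computes $s.a$ by searching for the suffix of $sa$ that lies in the code. The function name `act` is an assumption of this sketch, and the code $\{a, ab, abb, bbb\}$ is the one that appears in Example 8.3.

```python
def act(code, s, a):
    """Return s.a, the unique suffix t of the word sa that lies in the code.

    Existence relies on (8.1), SA contained in A*S; uniqueness relies on the
    code being a suffix code.
    """
    word = s + a
    suffixes = [word[i:] for i in range(len(word)) if word[i:] in code]
    if len(suffixes) != 1:
        raise ValueError(f"{word!r} has {len(suffixes)} suffixes in the code")
    return suffixes[0]

S = {"a", "ab", "abb", "bbb"}

# Full action table of the alphabet {a, b} on S.
TABLE = {(s, a): act(S, s, a) for s in sorted(S) for a in "ab"}

for (s, a), t in TABLE.items():
    print(f"{s}.{a} = {t}")
```

The error branch is deliberate: a set that is not a semaphore code may violate (8.1) or fail to be a suffix code, and then the action is not well defined.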




To turn the action $S \times A \to S$ into a random walk, we impose a Bernoulli distribution on $A^*$,
see [6, Section 1.11]. More precisely, we associate a probability $0 < \pi(a) < 1$ to each letter $a \in A$
such that $\sum_{a \in A} \pi(a) = 1$. The state space of the random walk is $S$. Given $s \in S$, with probability
$\pi(a)$ we transition to state $s.a$ in one step. This gives rise to the transition matrix $\mathcal{T}$ with entry in
row $s$ and column $s'$

$$\mathcal{T}_{s,s'} = \sum_{\substack{a \in A \\ s' = s.a}} \pi(a).$$

Since $\sum_{a \in A} \pi(a) = 1$, it follows that the row sums of $\mathcal{T}$ are equal to one, so that $\mathcal{T}$ is a row stochastic
matrix. Taking $\ell$ steps in the random walk is described by the $\ell$-th power of $\mathcal{T}$, that is, the probability
of going from $s$ to $s'$ in $\ell$ steps is the $(s,s')$-entry $(\mathcal{T}^\ell)_{s,s'}$ of $\mathcal{T}^\ell$. Under the Bernoulli distribution,
the probability $\pi(a_1 \cdots a_\ell)$ of a word of length $\ell$ is given by the multiplicative formula

$$\pi(a_1 \cdots a_\ell) = \prod_{i=1}^{\ell} \pi(a_i).$$
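
A minimal sketch of this construction, using exact rational arithmetic, is given below. The code $\{a, ab, abb, bbb\}$ is taken from Example 8.3, while the values $\pi(a) = 1/3$, $\pi(b) = 2/3$ and all function names are arbitrary choices made for illustration.

```python
from fractions import Fraction

def act(code, s, a):
    """s.a: the suffix of sa lying in the code (assumes a semaphore code)."""
    word = s + a
    return next(word[i:] for i in range(len(word)) if word[i:] in code)

def transition_matrix(code, pi):
    """Row-stochastic matrix with T[s][t] = sum of pi(a) over letters a with s.a = t."""
    states = sorted(code)
    T = {s: {t: Fraction(0) for t in states} for s in states}
    for s in states:
        for a, p in pi.items():
            T[s][act(code, s, a)] += p
    return states, T

def word_probability(word, pi):
    """Multiplicative Bernoulli formula: the product of pi over the letters."""
    prob = Fraction(1)
    for a in word:
        prob *= pi[a]
    return prob

S = {"a", "ab", "abb", "bbb"}
PI = {"a": Fraction(1, 3), "b": Fraction(2, 3)}  # illustrative values only
STATES, T = transition_matrix(S, PI)

for s in STATES:
    print(s, [str(T[s][t]) for t in STATES])
```

Using `Fraction` rather than floating point keeps the row sums exactly equal to one, so stochasticity can be tested with equality.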

A suffix code $X$ on $A^*$ is maximal if it is not properly contained in any other suffix code on $A^*$,
that is, if $X \subseteq Y \subseteq A^*$ and $Y$ is a suffix code, then $Y = X$. Furthermore, $X$ is called thin if there
exists an element $w \in A^*$ such that $A^*wA^* \cap X = \emptyset$. By [6, Proposition 3.3.10], for a thin maximal
suffix code $X$ we have $\pi(X) = \sum_{x \in X} \pi(x) = 1$ for all positive Bernoulli distributions $\pi$ on $X$. A
Bernoulli distribution on $X$ is positive if $\pi(x) > 0$ for all $x \in X$. As shown in [6, Proposition 3.5.1],
semaphore codes $S$ are thin maximal suffix codes, so that

$$\pi(S) = \sum_{s \in S} \pi(s) = 1. \tag{8.2}$$

Hence any positive Bernoulli distribution on semaphore codes yields a probability distribution.

A stationary distribution $I = (I_s)_{s \in S}$ is a vector such that $\sum_{s \in S} I_s = 1$ and $I\mathcal{T} = I$, that is,
it is a left eigenvector of the transition matrix with eigenvalue one. In the finite state case, by the
Perron-Frobenius Theorem, the stationary distribution exists. It is unique if the random walk is
irreducible. See [13] for more details. In our case, we prove next that a stationary distribution exists
and give its explicit form.

Theorem 8.1 The stationary distribution of the random walk associated to the semaphore code $S$
is given by

$$I = (\pi(s))_{s \in S}.$$

Proof. Taking the $s'$-th component of $I\mathcal{T} = I$ reads

$$\sum_{s \in S} \sum_{\substack{a \in A \\ s' = s.a}} \pi(a)\pi(s) = \pi(s'). \tag{8.3}$$

Recall that $s.a = s'$ with $a \in A$ and $s, s' \in S$ means that $sa = ws'$ for some $w \in A^*$. In particular,
this can only hold if $a$ is the last letter of $s'$ and hence fixed by $s'$.

Claim: The set $S' = \{w \mid sa = ws', s \in S\}$ for fixed $s' \in S$ with $a \in A$ the last letter of $s'$, is a thin
maximal suffix code.

Indeed, if the claim is true, we have $\sum_{w \in S'} \pi(w) = 1$ by [6, Proposition 3.3.10]. Using that
$\pi(a)\pi(s) = \pi(w)\pi(s')$ we can hence rewrite the left-hand side of (8.3) as

$$\sum_{s \in S} \sum_{\substack{a \in A \\ s' = s.a}} \pi(a)\pi(s) = \sum_{w \in S'} \pi(w)\pi(s') = \pi(s')$$

as desired. It remains to prove the claim.

First assume that $S'$ is not a suffix code. Then there must be two elements $w, w' \in S'$ that
are comparable in suffix order. But then $ws'$ and $w's'$ are comparable in suffix order, contradicting
the fact that $S$ is a suffix code (since after removing the last letter $a$ the result must be in $S$).
Next assume that $S'$ is not maximal. This means there exists $y \in A^*$ such that $S' \subsetneq S' \cup \{y\}$ is
a suffix code. But then $S \cup \{y\tilde{s}\}$ is a suffix code, where $\tilde{s}$ is obtained from $s'$ by removing the
last letter $a$, contradicting the maximality of $S$ (recall that all semaphore codes are maximal by [6,
Proposition 3.5.1]). Finally assume that $S'$ is not thin. That means that $A^*wA^* \cap S' \neq \emptyset$ for every
$w \in A^*$. Since $S$ is thin, we may choose $w$ such that $A^*wA^* \cap S = \emptyset$, and then $uwv \in S'$ for some $u, v \in A^*$. Since by construction $S's' \subseteq S$,
this would imply $uwvs' \in S$, contradicting the choice of $w$. □
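
Theorem 8.1 can be checked on a small instance. The sketch below verifies $I\mathcal{T} = I$ for $I_s = \pi(s)$ on the code $\{a, ab, abb, bbb\}$ of Example 8.3; the probabilities and the helper names are assumptions made for illustration, and the check is a numerical sanity test rather than a substitute for the proof.

```python
from fractions import Fraction

def act(code, s, a):
    """s.a: the suffix of sa lying in the code (assumes a semaphore code)."""
    word = s + a
    return next(word[i:] for i in range(len(word)) if word[i:] in code)

def prob(word, pi):
    """Bernoulli probability of a word: product of the letter probabilities."""
    result = Fraction(1)
    for a in word:
        result *= pi[a]
    return result

def is_stationary(code, pi):
    """Check that I_s = pi(s) sums to one and satisfies I T = I."""
    states = sorted(code)
    I = {s: prob(s, pi) for s in states}
    image = {t: Fraction(0) for t in states}
    for s in states:
        for a, p in pi.items():
            # Mass I_s flows to state s.a with probability pi(a).
            image[act(code, s, a)] += I[s] * p
    return sum(I.values()) == 1 and image == I

S = {"a", "ab", "abb", "bbb"}
PI = {"a": Fraction(1, 3), "b": Fraction(2, 3)}  # illustrative values only
print(is_stationary(S, PI))
```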

Given $A = \{a_1, \ldots, a_n\}$ and a right congruence $\rho \in \mathrm{RC}(A^k)$, we are interested in the probability
for nonempty words of length $\ell \le k$ to be resets on $A^k/\rho$. Since $\mathrm{Res}(\rho) = \mathrm{Res}(\underline{\rho})$ by Propositions 7.10
and 7.13, we can restrict ourselves to determining the probabilities for resets of words of given length
for $\underline{\rho} \in \mathrm{SRC}(A^k)$, or equivalently for semaphore codes $\Lambda_{\underline{\rho}}$ by Corollary 7.11.

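Theorem 8.2 Let $\rho \in \mathrm{RC}(A^k)$. Then the probability that a word of length $1 \le \ell \le k$ is a reset on
$A^k/\rho$ is given by

$$P(\ell) = \sum_{\substack{s \in \Lambda_{\underline{\rho}} \\ \ell(s) \le \ell}} \prod_{a \in s} \pi(a),$$

where $a \in s$ in the product runs over every letter in $s$ and $\ell(s)$ is the length of the word (or suffix) $s$.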
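Proof. As mentioned above, $\mathrm{Res}(\rho) = \mathrm{Res}(\underline{\rho})$ by Propositions 7.10 and 7.13 and in addition $\Lambda_{\underline{\rho}}$ is a
semaphore code. Define $\mathrm{Res}(\ell) = \{w \in A^+ \mid \ell(w) = \ell \text{ and } w \text{ is a reset on } A^k/\rho\} = \mathrm{Res}(\rho) \cap A^\ell$. We
claim that

$$\mathrm{Res}(\ell) = \{w \in A^+ \mid \ell(w) = \ell \text{ and } w \text{ has a suffix in } \Lambda_{\underline{\rho}}\}.$$

Since $\Lambda_{\underline{\rho}}$ is a suffix code, each such word has precisely one suffix in $\Lambda_{\underline{\rho}}$. Hence the claim immediately
yields the formula for $P(\ell)$ using that a letter $a \in s$ for $s \in \Lambda_{\underline{\rho}}$ occurs with probability $\pi(a)$.

We prove the claim by induction on $\ell$. By Proposition 7.10(vii) we have that $\Lambda_{\underline{\rho}} \subseteq \mathrm{Res}(\underline{\rho}) =
\mathrm{Res}(\rho)$. Certainly, for $\ell = 1$ the only words that are resets are the words/suffixes of length 1 in $\Lambda_{\underline{\rho}}$.
Now assume that the claim holds for all words of length less than $\ell$. Since $\Lambda_{\underline{\rho}} \subseteq \mathrm{Res}(\rho)$, we deduce
that

$$\{w \in A^+ \mid \ell(w) = \ell \text{ and } w \text{ has a suffix in } \Lambda_{\underline{\rho}}\} \subseteq \mathrm{Res}(\ell).$$

To prove the reverse inclusion let $v = a_\ell \cdots a_1 \in \mathrm{Res}(\ell)$. If $v \in \Lambda_{\underline{\rho}}$, we are done. If $a_{\ell-1} \cdots a_1 \in
\mathrm{Res}(\ell - 1)$, then by induction $v$ has a suffix in $\Lambda_{\underline{\rho}}$. Hence assume that $a_{\ell-1} \cdots a_1 \notin \mathrm{Res}(\ell - 1)$ and
$v \notin \Lambda_{\underline{\rho}}$. This requires that $a_\ell \cdots a_2$ is a reset, so that again by induction $a_\ell \cdots a_2$ has a suffix $s$ in
$\Lambda_{\underline{\rho}}$. Since $\Lambda_{\underline{\rho}}$ is a semaphore code and hence $\Lambda_{\underline{\rho}} A \subseteq A^*\Lambda_{\underline{\rho}}$, we have that if $s \in \Lambda_{\underline{\rho}}$, then $sa_1 \in A^*\Lambda_{\underline{\rho}}$.
In all cases $v$ has a suffix in $\Lambda_{\underline{\rho}}$. This concludes the proof of the claim. □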

Example 8.3 Take the special right congruence $\rho$ given by congruence classes $\{aaa, baa, aba, bba\}$,
$\{aab, bab\}$, $\{abb\}$, $\{bbb\}$ with corresponding semaphore code $\Lambda_\rho = \{a, ab, abb, bbb\}$. The probability to
have a reset for words of length $\ell$ is

$$\begin{aligned}
P(1) &= \pi(a), \\
P(2) &= \pi(a) + \pi(a)\pi(b), \\
P(3) &= \pi(a) + \pi(a)\pi(b) + \pi(a)\pi(b)^2 + \pi(b)^3 = \pi(a) + \pi(a)\pi(b) + \pi(b)^2 = \pi(a) + \pi(b) = 1,
\end{aligned}$$

where for $P(3)$ we have used repeatedly that $\pi(a) + \pi(b) = 1$.
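
The values in Example 8.3 can be compared against a brute-force count. The following Python sketch evaluates the formula of Theorem 8.2 and, independently, sums the probabilities of all words of length $\ell$ that act as a constant map on the classes. It assumes that this is what "reset" means and that the action on $A^3$ appends and truncates to the last three letters; the probabilities $1/3$ and $2/3$ and the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

K = 3
CLASSES = [{"aaa", "baa", "aba", "bba"}, {"aab", "bab"}, {"abb"}, {"bbb"}]
CODE = {"a", "ab", "abb", "bbb"}
PI = {"a": Fraction(1, 3), "b": Fraction(2, 3)}  # illustrative values only

def prob(word, pi):
    """Bernoulli probability of a word: product of the letter probabilities."""
    result = Fraction(1)
    for a in word:
        result *= pi[a]
    return result

def reset_probability(code, length, pi):
    """Formula of Theorem 8.2: sum of pi(s) over code words s with len(s) <= length."""
    return sum((prob(s, pi) for s in code if len(s) <= length), Fraction(0))

def brute_force(classes, length, pi, k=K):
    """Total probability of the words of this length that map all of A^k into one class."""
    alphabet = sorted(pi)
    states = ["".join(u) for u in product(alphabet, repeat=k)]
    block = {u: i for i, c in enumerate(classes) for u in c}
    total = Fraction(0)
    for letters in product(alphabet, repeat=length):
        w = "".join(letters)
        if len({block[(u + w)[-k:]] for u in states}) == 1:
            total += prob(w, pi)
    return total

for n in range(1, K + 1):
    print(n, reset_probability(CODE, n, PI), brute_force(CLASSES, n, PI))
```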

Example 8.4 Take the semaphore code

$$\{aa, aab, aba, abba, babb, aabb, bbab, abab, bbba, aabbb, babbb, abbbb, bbbbb\},$$

which corresponds to a special right congruence, as is easy to check by Proposition 7.10. Then
we have

$$\begin{aligned}
P(1) &= 0, \\
P(2) &= \pi(a)^2, \\
P(3) &= \pi(a)^2 + 2\pi(a)^2\pi(b), \\
P(4) &= \pi(a)^2 + 2\pi(a)^2\pi(b) + 3\pi(a)^2\pi(b)^2 + 3\pi(a)\pi(b)^3 = \pi(a)^2 + 2\pi(a)^2\pi(b) + 3\pi(a)\pi(b)^2 \\
&= \pi(a)^2 + 2\pi(a)\pi(b) + \pi(a)\pi(b)^2 = \pi(a) + \pi(a)\pi(b) + \pi(a)\pi(b)^2, \\
P(5) &= \pi(a) + \pi(a)\pi(b) + \pi(a)\pi(b)^2 + \pi(a)^2\pi(b)^3 + 2\pi(a)\pi(b)^4 + \pi(b)^5 \\
&= \pi(a) + \pi(a)\pi(b) + \pi(a)\pi(b)^2 + \pi(a)\pi(b)^3 + \pi(b)^4 \\
&= \pi(a) + \pi(a)\pi(b) + \pi(a)\pi(b)^2 + \pi(b)^3 = \pi(a) + \pi(a)\pi(b) + \pi(b)^2 \\
&= \pi(a) + \pi(b) = 1,
\end{aligned}$$

where again we repeatedly used that $\pi(a) + \pi(b) = 1$.
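
The chain of simplifications in Example 8.4 can be spot-checked with exact arithmetic. In the sketch below the word $aabbb$ is used as the single code word of length five containing two letters $a$, which is the reading consistent with the term $\pi(a)^2\pi(b)^3$ in $P(5)$; this reading, the chosen probabilities and the helper names are assumptions of the sketch.

```python
from fractions import Fraction

# Semaphore code of Example 8.4, with "aabbb" as the length-five word in a^2 b^3
# (an assumption of this sketch, see the accompanying text).
CODE = [
    "aa", "aab", "aba", "abba", "babb", "aabb", "bbab", "abab", "bbba",
    "aabbb", "babbb", "abbbb", "bbbbb",
]
PI = {"a": Fraction(1, 3), "b": Fraction(2, 3)}  # illustrative values only

def prob(word, pi):
    """Bernoulli probability of a word: product of the letter probabilities."""
    result = Fraction(1)
    for a in word:
        result *= pi[a]
    return result

def reset_probability(code, length, pi):
    """Formula of Theorem 8.2: sum of pi(s) over code words s with len(s) <= length."""
    return sum((prob(s, pi) for s in code if len(s) <= length), Fraction(0))

def is_suffix_code(code):
    """True if no word of the code is a proper suffix of another one."""
    return not any(x != y and y.endswith(x) for x in code for y in code)

def closed_under_action(code, alphabet):
    """Check (8.1): every word sa with s in the code has a suffix in the code."""
    return all(any((s + a).endswith(t) for t in code) for s in code for a in alphabet)

P = {n: reset_probability(CODE, n, PI) for n in range(1, 6)}
print(is_suffix_code(CODE), closed_under_action(CODE, "ab"), P)
```

With these probabilities, $P(4)$ agrees with the closed form $\pi(a) + \pi(a)\pi(b) + \pi(a)\pi(b)^2$ and $P(5) = 1$.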

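The probability $P(\ell)$ to reach a reset in $\ell$ steps is related to the hitting time (see [13, Chapter
10]). Namely, given a Markov chain with state space $S$, the hitting time $t_R$ of a subset $R \subseteq S$ is the
first time one of the nodes in $R$ is visited by the chain. We are interested in the hitting time $t_{\mathrm{Res}(\rho)}$
for $\rho \in \mathrm{RC}(A^k)$. Set

$$p(\ell) = P(\ell) - P(\ell - 1) = \sum_{\substack{s \in \Lambda_{\underline{\rho}} \\ \ell(s) = \ell}} \prod_{a \in s} \pi(a).$$

Then

$$t_{\mathrm{Res}(\rho)} = \sum_{\ell=1}^{k} \ell\, p(\ell).$$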
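
To make the quantities $p(\ell)$ concrete, the sketch below computes them for the code $\{a, ab, abb, bbb\}$ of Example 8.3 and then forms the weighted sum $\sum_{\ell} \ell\, p(\ell)$. Taking the hitting time in expectation in this way is an interpretive assumption of the sketch, as are the probabilities and the function names.

```python
from fractions import Fraction

def prob(word, pi):
    """Bernoulli probability of a word: product of the letter probabilities."""
    result = Fraction(1)
    for a in word:
        result *= pi[a]
    return result

def first_reset_distribution(code, k, pi):
    """p(l) = P(l) - P(l-1): sum of pi(s) over code words of length exactly l."""
    return {
        length: sum((prob(s, pi) for s in code if len(s) == length), Fraction(0))
        for length in range(1, k + 1)
    }

def expected_hitting_time(code, k, pi):
    """Assumed reading of the hitting time: the expectation sum of l * p(l)."""
    return sum(length * p for length, p in first_reset_distribution(code, k, pi).items())

CODE = {"a", "ab", "abb", "bbb"}
PI = {"a": Fraction(1, 3), "b": Fraction(2, 3)}  # illustrative values only

DIST = first_reset_distribution(CODE, 3, PI)
print(DIST, expected_hitting_time(CODE, 3, PI))
```

Since $P(k) = 1$ for a semaphore code with words of length at most $k$, the values $p(1), \ldots, p(k)$ sum to one and form a probability distribution on the first reset time.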

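Note that by Definition 2.2, we also have a right action of $A$ on right congruences $\rho \in \mathrm{RC}(A^k)$,
namely $\rho \times A \to \rho$. Hence, as for semaphore codes, we can define a random walk on $\rho$ by assigning
a probability $\pi(a)$ for each $a \in A$. Recall that by its definition in (7.7), $\underline{\rho}$ is a refinement of $\rho$. Let
us relate these various random walks. A step $s.a = t$ for $s, t \in \Lambda_{\underline{\rho}}$ and $a \in A$ in the random walk
on the semaphore code $\Lambda_{\underline{\rho}}$ is in one-to-one correspondence to a step $c_s.a = c_t$ in the random walk
on $\underline{\rho} \in \mathrm{SRC}(A^k)$, where $c_s, c_t \in \underline{\rho}$ are the unique congruence classes such that $\mathrm{lcs}(c_s) = s$, $\mathrm{lcs}(c_t) = t$,
respectively. Since $\underline{\rho}$ is a refinement of $\rho$, a step $c_s.a = c_t$ on $\underline{\rho}$ implies a step $c.a = d$ on $\rho$ whenever
$c_s \subseteq c$ and $c_t \subseteq d$. In particular, the transition matrix $\mathcal{T}$ for the random walk on the semaphore
code $\Lambda_{\underline{\rho}}$ satisfies for a fixed $d \in \rho$

$$\sum_{\substack{t \in \Lambda_{\underline{\rho}} \\ c_t \subseteq d}} \mathcal{T}_{s,t} = \sum_{\substack{t \in \Lambda_{\underline{\rho}} \\ c_t \subseteq d}} \mathcal{T}_{s',t} \quad \text{for all } s, s' \in \Lambda_{\underline{\rho}} \text{ such that } c_{s'} \mathrel{\rho} c_s. \tag{8.5}$$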
This relation is precisely the condition for a Markov chain to be lumpable. Lumpability was first
introduced by Kemeny and Snell [12], see also [13, Section 2.3.1]. This means that the transition
matrix $\mathcal{T}^\rho$ on $\rho$ indexed by right congruence classes $c, d \in \rho$ can be expressed in terms of $\mathcal{T}$ as follows:

$$\mathcal{T}^\rho_{c,d} = \sum_{\substack{t \in \Lambda_{\underline{\rho}} \\ c_t \subseteq d}} \mathcal{T}_{s,t} \quad \text{for any } s \in \Lambda_{\underline{\rho}} \text{ such that } c_s \subseteq c.$$
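
The lumpability condition can be tested directly on a transition matrix. The sketch below is an illustration under explicit assumptions: the code `CODE` and the grouping `BLOCKS` of its words are hypothetical inputs chosen for this example (they are not taken from the text), each block standing for the set of code words $s$ with $c_s \subseteq c$ for one class $c$. The function `lump` checks that all rows within a block agree after summing columns over blocks, and returns the lumped matrix.

```python
from fractions import Fraction

def act(code, s, a):
    """s.a: the suffix of sa lying in the code (assumes a semaphore code)."""
    word = s + a
    return next(word[i:] for i in range(len(word)) if word[i:] in code)

def transition_matrix(code, pi):
    """T[s][t] = sum of pi(a) over letters a with s.a = t."""
    T = {s: {t: Fraction(0) for t in code} for s in code}
    for s in code:
        for a, p in pi.items():
            T[s][act(code, s, a)] += p
    return T

def lump(T, blocks):
    """Check the lumpability condition and return the matrix indexed by block number."""
    lumped = {}
    for i, c in enumerate(blocks):
        rows = [
            {j: sum((T[s][t] for t in d), Fraction(0)) for j, d in enumerate(blocks)}
            for s in sorted(c)
        ]
        if any(row != rows[0] for row in rows):
            raise ValueError(f"block {sorted(c)} violates the lumpability condition")
        lumped[i] = rows[0]
    return lumped

# Hypothetical semaphore code and hypothetical grouping of its words.
CODE = {"aa", "ab", "aba", "bba", "abb", "bbb"}
BLOCKS = [{"aa", "aba"}, {"ab"}, {"bba"}, {"abb"}, {"bbb"}]
PI = {"a": Fraction(1, 3), "b": Fraction(2, 3)}  # illustrative values only

LUMPED = lump(transition_matrix(CODE, PI), BLOCKS)
print(LUMPED[0])
```

The rows of `aa` and `aba` agree after summing over blocks, so the first block may be merged into a single state.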

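The theory of lumpability (or projection) then gives us the stationary distribution for $\mathcal{T}^\rho$.

Proposition 8.5 Let $I^\rho = (I^\rho_c)_{c \in \rho}$ be the stationary distribution for $\mathcal{T}^\rho$. Then

$$I^\rho_c = \sum_{\substack{s \in \Lambda_{\underline{\rho}} \\ c_s \subseteq c}} \pi(s).$$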

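Proof. By lumpability, we have

$$I^\rho_c = \sum_{\substack{s \in \Lambda_{\underline{\rho}} \\ c_s \subseteq c}} I_s,$$

where $I = (I_s)_{s \in \Lambda_{\underline{\rho}}}$ is the stationary distribution of $\mathcal{T}$. By Theorem 8.1 we have $I_s = \pi(s)$. □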

Remark 8.6 We could have derived an expression for $I^\rho$ also directly from the stationary distribution
of the delay de Bruijn random walk by lumping, given as

$$I^\rho_c = \sum_{x \in c} \pi(x).$$
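
The lumping described in Remark 8.6 can be illustrated on the right congruence of Example 7.12. The sketch below assumes that the walk on $A^3$ moves from $u$ to the last three letters of $ua$ with probability $\pi(a)$; the probabilities and the function names are choices made for this illustration. It builds the lumped matrix on the classes and checks that $I^\rho_c = \sum_{x \in c} \pi(x)$ is stationary for it.

```python
from fractions import Fraction

K = 3
PI = {"a": Fraction(1, 3), "b": Fraction(2, 3)}  # illustrative values only

# Classes of the right congruence of Example 7.12.
CLASSES = [{"aaa", "aba", "baa"}, {"bab", "aab"}, {"abb"}, {"bba"}, {"bbb"}]

def prob(word, pi):
    """Bernoulli probability of a word: product of the letter probabilities."""
    result = Fraction(1)
    for a in word:
        result *= pi[a]
    return result

def lumped_walk(classes, pi, k=K):
    """Transition matrix on the classes induced by the walk u -> (ua)[-k:]."""
    block = {u: i for i, c in enumerate(classes) for u in c}
    n = len(classes)
    matrix = [[Fraction(0)] * n for _ in range(n)]
    for i, c in enumerate(classes):
        for a, p in pi.items():
            targets = {block[(u + a)[-k:]] for u in c}
            if len(targets) != 1:
                raise ValueError("the partition is not a right congruence")
            matrix[i][targets.pop()] += p
    return matrix

def lumped_distribution(classes, pi):
    """Candidate stationary vector: sum of pi(x) over the words x of each class."""
    return [sum((prob(x, pi) for x in c), Fraction(0)) for c in classes]

def is_stationary(matrix, dist):
    """Check that dist sums to one and is a left fixed vector of the matrix."""
    n = len(dist)
    return sum(dist) == 1 and all(
        sum(dist[i] * matrix[i][j] for i in range(n)) == dist[j] for j in range(n)
    )

MATRIX = lumped_walk(CLASSES, PI)
DIST = lumped_distribution(CLASSES, PI)
print(DIST, is_stationary(MATRIX, DIST))
```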